PGI User Forum


declare dim3 type using %x,%y doesn't work

 
Tuan



Joined: 11 Jun 2009
Posts: 233

Posted: Sun Jan 09, 2011 3:14 pm    Post subject: declare dim3 type using %x,%y doesn't work

It seems that my CUDA Fortran code only works when I set up the 2D/3D grid and thread block with the dim3 constructor, as follows:

Code:
type(dim3) :: dimGrid, dimBlock

 dimGrid = dim3( N/16, L/16, 1 )
 dimBlock = dim3( 16, 16, 1 )
 call mmul_kernel<<<dimGrid,dimBlock>>>( Adev,Bdev,Cdev,N,M,L )


I get a runtime error if I instead assign the components one by one, C-style:

Code:
type(dim3) :: dimGrid, dimBlock

 dimGrid%x = N/16
 dimGrid%y = L/16
 dimGrid%z = 1
 dimBlock%x = 16
 dimBlock%y = 16
 dimBlock%z = 1
 call mmul_kernel<<<dimGrid,dimBlock>>>( Adev,Bdev,Cdev,N,M,L )


I think either approach should be valid. Any ideas?

Tuan
mkcolg



Joined: 30 Jun 2004
Posts: 6129
Location: The Portland Group Inc.

Posted: Mon Jan 10, 2011 2:45 pm

Hi Tuan,

Something else must be going on, since both methods work for me. Can you post a reproducer?

Go Ducks!
Mat

Example:
Code:
% cat test2.cuf

module testme
use cudafor

contains

attributes (global) subroutine mmul_kernel(A,N,L)
use cudafor
real, dimension(:,:) :: A
integer, value :: N,L
integer :: ix,iy

ix = threadidx%x + blockdim%x*(blockidx%x-1)
iy = threadidx%y + blockdim%y*(blockidx%y-1)
if (ix.le.N.and.iy.le.L) then
   A(ix,iy) = ix*iy
endif

end subroutine

end module testme

program test
use cudafor
use testme
real, dimension(:,:), allocatable, device :: Adev
real, dimension(:,:), allocatable :: A
integer :: N,L
type(dim3) :: dimGrid, dimBlock

N=64
L=64
allocate(Adev(N,L), A(N,L))

 dimGrid%x = N/16
 dimGrid%y = L/16
 dimGrid%z = 1
 dimBlock%x = 16
 dimBlock%y = 16
 dimBlock%z = 1
call mmul_kernel<<<dimGrid,dimBlock>>>( Adev,N,L )
A=Adev

print *, A(1,1), A(N,L)
end program test
% pgf90 test2.cuf -o test2.out -V11.0 -fast
% test2.out
    1.000000        4096.000 
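
When putting together a reproducer for a launch problem like this, it can help to run both dim3 forms back to back and ask the runtime what went wrong after each launch. The following is only a sketch along the lines of the example above, not a diagnosis of the original failure. The names check_launch, fill_kernel and report are hypothetical and chosen for illustration. It also assumes a compiler whose cudafor module provides cudaDeviceSynchronize; toolchains from the time of this thread may only offer cudaThreadSynchronize.

```fortran
module check_launch
  use cudafor
  implicit none
contains

  ! Trivial kernel modelled on the one in the example above:
  ! each thread writes ix*iy into its own element of A.
  attributes(global) subroutine fill_kernel(A, N, L)
    real, dimension(:,:) :: A
    integer, value :: N, L
    integer :: ix, iy

    ix = threadidx%x + blockdim%x*(blockidx%x - 1)
    iy = threadidx%y + blockdim%y*(blockidx%y - 1)
    if (ix <= N .and. iy <= L) A(ix,iy) = real(ix*iy)
  end subroutine fill_kernel

  ! Hypothetical helper: report any error from the most recent launch.
  ! cudaGetLastError catches launch-configuration errors;
  ! cudaDeviceSynchronize waits for the kernel and returns execution errors.
  subroutine report(label)
    character(len=*), intent(in) :: label
    integer :: istat

    istat = cudaGetLastError()
    if (istat == cudaSuccess) istat = cudaDeviceSynchronize()
    if (istat /= cudaSuccess) then
      print *, label, ': ', trim(cudaGetErrorString(istat))
    else
      print *, label, ': ok'
    end if
  end subroutine report

end module check_launch

program test_dim3
  use cudafor
  use check_launch
  implicit none
  real, dimension(:,:), allocatable, device :: Adev
  real, dimension(:,:), allocatable :: A
  integer :: N, L
  type(dim3) :: dimGrid, dimBlock

  N = 64
  L = 64
  allocate(Adev(N,L), A(N,L))

  ! Form 1: dim3 constructor
  dimGrid  = dim3(N/16, L/16, 1)
  dimBlock = dim3(16, 16, 1)
  call fill_kernel<<<dimGrid,dimBlock>>>(Adev, N, L)
  call report('constructor')

  ! Form 2: component-by-component assignment
  dimGrid%x = N/16
  dimGrid%y = L/16
  dimGrid%z = 1
  dimBlock%x = 16
  dimBlock%y = 16
  dimBlock%z = 1
  call fill_kernel<<<dimGrid,dimBlock>>>(Adev, N, L)
  call report('components')

  A = Adev
  print *, A(1,1), A(N,L)
end program test_dim3
```

If one form reports an error and the other does not, the message text narrows things down. Also note that N/16 is integer division, so if N or L is not a multiple of 16 the grid will not cover the whole array.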
All times are GMT - 7 Hours