PGI User Forum

cuda variable not updating
 
cablesb




Posted: Mon May 21, 2012 2:21 pm

Thanks much for the advice. But today's not my day. I have decided to try something much simpler. I am trying to just take one element of my array and increment it with each iteration step:

Code:
module dimensions
integer, parameter :: nx=256, ny=256
end module


module diff_kernel_mod

contains


attributes(global) subroutine add_source(v,iloc,jloc)

real*8 :: v(:,:)
integer, value :: iloc, jloc

i=(blockIdx%x-1)*blockDim%x+threadIdx%x
j=(blockIdx%y-1)*blockDim%y+threadIdx%y

if (i==iloc .and. j==jloc) then
  v(i,j)=v(i,j)+1
endif

end subroutine

end module diff_kernel_mod


program diffuse


use cudafor
use dimensions
use diff_kernel_mod


real*8 :: v(nx,ny), diffconst
real*8, device :: v_d(nx,ny)
integer :: nloops, outputdl
type(dim3) :: grid, tBlock
real*8, device :: diffconst_d

nloops=100
outputdl=10
diffconst=.25

outputdl_d=outputdl
diffconst_d=diffconst

tBlock=dim3(64,64,1)
grid=dim3(ceiling(real(nx)/tBlock%x), &
          ceiling(real(ny)/tBlock%y), 1)

open(unit=11,file='diff_output.txt')

v=0.

n=0

do while (n.le.nloops)
  do m=1,outputdl
    v_d=v
    call add_source<<<grid,tBlock>>>(v_d,64,64)
    v=v_d
  enddo
  n=n+outputdl
  write(11,*) v
  print *,n
enddo


end program

I expect v(64,64) to increase steadily, and when I compile with -Mcuda=emu, it does. But when I run it on the GPGPU for real, it stubbornly remains zero. If you could tell me what I'm doing wrong, I would be much obliged.
mkcolg



Location: The Portland Group Inc.

Posted: Mon May 21, 2012 3:02 pm

Hi Cablesb,

Adding a bit of error handling shows that you're getting an "invalid configuration argument" error. The problem is that your block size is 64x64, or 4096 threads per block. The maximum number of threads per block varies by device, but on my Tesla C2070 the max is 1024. Changing "64" to "16" in the block dimensions gives 256 threads per block and fixes the problem. Use the utility "pgaccelinfo" to see the max for your device.

Code:
module dimensions
integer, parameter :: nx=256, ny=256
end module


module diff_kernel_mod

contains


attributes(global) subroutine add_source(v,iloc,jloc)
implicit none
real*8 :: v(:,:)
integer, value :: iloc, jloc
integer :: i,j

i=(blockIdx%x-1)*blockDim%x+threadIdx%x
j=(blockIdx%y-1)*blockDim%y+threadIdx%y

if (i==iloc .and. j==jloc) then
  v(i,j)=v(i,j)+1
endif

end subroutine

end module diff_kernel_mod


program diffuse


use cudafor
use dimensions
use diff_kernel_mod
implicit none


real*8 :: v(nx,ny), diffconst
real*8, device :: v_d(nx,ny)
integer :: nloops, outputdl,n,m,ierr
type(dim3) :: grid, tBlock

nloops=1
outputdl=10
diffconst=.25

tBlock=dim3(16,16,1)
grid=dim3(ceiling(real(nx)/tBlock%x), &
          ceiling(real(ny)/tBlock%y), 1)
print *, 'Block:', tBlock
print *, 'Grid: ', grid
open(unit=11,file='diff_output.txt')

v=0.
n=0

do while (n.le.nloops)
  do m=1,outputdl
    v_d=v
    call add_source<<<grid,tBlock>>>(v_d,16,16)
    ierr=cudaGetLastError()
    if (ierr.ne.0) then
       print *, 'ERROR:', cudaGetErrorString(ierr)
    endif
    v=v_d
  enddo
  n=n+outputdl
  write(11,*) v
  print *,n
enddo

end program
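
The per-block limit can also be checked from inside the program instead of with pgaccelinfo. Here is a minimal sketch, assuming device 0 is the GPU in use and reusing the 64x64 block from the original post. The program name query_limits is made up for illustration.

```fortran
program query_limits
  use cudafor
  implicit none

  type(cudaDeviceProp) :: prop
  type(dim3) :: tBlock
  integer :: istat, nthreads

  ! Assumption: device 0 is the GPU the kernel will run on.
  istat = cudaGetDeviceProperties(prop, 0)
  if (istat /= cudaSuccess) then
     print *, 'ERROR: ', cudaGetErrorString(istat)
     stop
  endif

  ! Block shape from the original post: 64*64*1 threads per block.
  tBlock = dim3(64,64,1)
  nthreads = tBlock%x * tBlock%y * tBlock%z

  print *, 'Max threads per block:       ', prop%maxThreadsPerBlock
  print *, 'Requested threads per block: ', nthreads

  ! A launch with more threads per block than the device allows fails
  ! with "invalid configuration argument" and the kernel never runs.
  if (nthreads > prop%maxThreadsPerBlock) then
     print *, 'Block is too large for this device'
  endif
end program query_limits
```

Running a check like this before the launch catches an oversized block up front. The cudaGetLastError() call after the launch is still worth keeping for other failures.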


Hope this helps,
Mat
mkcolg




Posted: Tue May 22, 2012 11:23 am

Quote:
I think this has to do with the HTML parser or something with the Forum. Try substituting, say, {{{ }}} instead in your paste.
I think I have this fixed now. It looks like I just needed to turn off HTML support. I'll watch for other issues.

- Mat
All times are GMT - 7 Hours