PGI User Forum
Three Dimensional Matrices
rotteweiler



Joined: 26 May 2010
Posts: 20

Posted: Thu Jul 01, 2010 7:41 am    Post subject: Three Dimensional Matrices

Hi everyone!

I am currently exploring how to use three-dimensional matrices in a kernel, but it is not as simple as I thought. According to the CUDA Fortran Programming Guide and Reference, the values of blockidx%z and griddim%z must always be one. As a further constraint, I know a block cannot have more than 512 threads. If the device will only accept two-dimensional matrices, what would be the best way to take a three-dimensional matrix and send it to the kernel?

This is the code I am trying to implement:

Code:

do m = 1, mba
   do j = 2, nj-1
      do i = 2, ni-1

         PHIN(i, j, m) = AN(i,j,m) * PHI(i,j+1,m)&
                         + AS(i,j,m) * PHI(i,j-1,m)&
                         + AE(i,j,m) * PHI(i,j+1,m)&
                         + AW(i,j,m) * PHI(i,j-1,m)&
                         + AP(i,j,m) * PHI(i,j+1,m)
      enddo
   enddo
enddo


Any feedback would be helpful since I am new to CUDA Fortran. Thank you for your time!

-Chris
rmsivley



Joined: 28 May 2010
Posts: 25
Location: NASA Langley Research Center

Posted: Thu Jul 01, 2010 8:10 am

edit: used your variable names

I'm fairly new to CUDA Fortran as well, so take my advice with a grain of salt. However, would it be possible to declare a one-dimensional dimGrid and a two-dimensional dimBlock? You could then reference your i, j, m indices with:

i = threadidx%x
j = threadidx%y
m = blockidx%x

Or at least something along those lines. I'm planning on doing something similar for the project I'm working on. The way I see it, you're taking your 3D array and dividing it up into 2D slices, referenced by 'm'. Each CUDA block handles one slice, while the block's 2D array of threads handles each element of that slice, referenced by 'i' and 'j'.

I know this isn't the same approach the PGI guide takes in its example, so I'm right here with you wondering if this is a legit method.
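
For readers following along, here is a minimal sketch of what that mapping might look like as a complete kernel. It is not code from the thread: the module name, kernel name, and argument list are hypothetical, the arrays are assumed to be single-precision with shape (ni, nj, mba), and the +1 offsets are my assumption to match the original loops, which start at 2. The stencil itself is copied as posted.

```fortran
module slice_kernel_mod          ! hypothetical module name
  use cudafor
  implicit none
contains
  ! One block per m-slice; one thread per interior (i,j) point of that slice.
  attributes(global) subroutine update_slices(phin, phi, an, as, ae, aw, ap, ni, nj, mba)
    integer, value :: ni, nj, mba
    real :: phin(ni,nj,mba), phi(ni,nj,mba)
    real :: an(ni,nj,mba), as(ni,nj,mba), ae(ni,nj,mba)
    real :: aw(ni,nj,mba), ap(ni,nj,mba)
    integer :: i, j, m

    ! CUDA Fortran thread and block indices start at 1, so add 1 to
    ! land on the first interior point (index 2).
    i = threadIdx%x + 1
    j = threadIdx%y + 1
    m = blockIdx%x

    ! Guard in case the block is larger than the interior of the slice.
    if (i <= ni-1 .and. j <= nj-1 .and. m <= mba) then
      phin(i,j,m) = an(i,j,m) * phi(i,j+1,m) &
                  + as(i,j,m) * phi(i,j-1,m) &
                  + ae(i,j,m) * phi(i,j+1,m) &
                  + aw(i,j,m) * phi(i,j-1,m) &
                  + ap(i,j,m) * phi(i,j+1,m)
    end if
  end subroutine update_slices
end module slice_kernel_mod
```

Under these assumptions the launch would use dimGrid = dim3(mba,1,1) and dimBlock = dim3(ni-2,nj-2,1), which only works while (ni-2)*(nj-2) stays within the per-block thread limit.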
rotteweiler



Joined: 26 May 2010
Posts: 20

Posted: Thu Jul 01, 2010 9:03 am

Thank you!
I will work on it and see what constraints the device gives me.

Sincerely,

Chris
rmsivley



Joined: 28 May 2010
Posts: 25
Location: NASA Langley Research Center

Posted: Thu Jul 01, 2010 9:13 am

Definitely let me know how it goes. Keep in mind with that technique the max (square) slice size is 22x22 to keep within the 512 thread limit.
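
The 22x22 figure follows from the 512-thread limit quoted above: 22*22 = 484 fits, while 23*23 = 529 does not. A small host-side check, written as a sketch in plain Fortran (the program and variable names are made up for illustration):

```fortran
program max_square_slice        ! hypothetical name
  implicit none
  integer, parameter :: max_threads = 512  ! per-block limit quoted in this thread
  integer :: n

  ! Grow n while the next larger square block still fits in one block.
  n = 1
  do while ((n+1)*(n+1) <= max_threads)
    n = n + 1
  end do

  print *, 'largest square block edge:', n
  print *, 'threads used:', n*n
end program max_square_slice
```

This reports an edge of 22 and 484 threads used. Since the loops in the original post skip the boundary points, a 22x22 thread block would cover a slice with ni = nj = 24 if only interior points are assigned to threads.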
mkcolg



Joined: 30 Jun 2004
Posts: 6745
Location: The Portland Group Inc.

Posted: Thu Jul 01, 2010 3:24 pm

Hi Chris,

Assuming "ni" is large, I would make your "m" loop correspond to "blockidx%x" and your "j" loop to "blockidx%y". Next, have each thread in the kernel execute multiple iterations of the "i" loop. Something like:

Code:
 
m  = blockIdx%x         ! block indices start at 1, matching m = 1..mba
j  = blockIdx%y + 1     ! adjust for the starting value of 2
tidx = threadIdx%x + 1  ! adjust for the starting value of 2
nthrds = blockDim%x

! each thread handles every nthrds-th value of i
do i = tidx, ni-1, nthrds
         PHIN(i, j, m) = AN(i,j,m) * PHI(i,j+1,m)&
                         + AS(i,j,m) * PHI(i,j-1,m)&
                         + AE(i,j,m) * PHI(i,j+1,m)&
                         + AW(i,j,m) * PHI(i,j-1,m)&
                         + AP(i,j,m) * PHI(i,j+1,m)
enddo


And your kernel launch would be something like:
Code:
   
    type(dim3) :: dimGrid, dimBlock
...
   dimBlock = dim3(512,1,1)   
   dimGrid = dim3(mba,nj-2,1)
   call foo<<<dimGrid,dimBlock>>>(arg1)


Hope this helps,
Mat
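
Putting the two fragments above together, a self-contained version might look like the sketch below. The thread only shows "arg1" as a placeholder, so the full argument list, the module name, the wrapper launch_foo, and the single-precision array types are all assumptions for illustration. The stencil is copied as posted.

```fortran
module stencil_mod               ! hypothetical module name
  use cudafor
  implicit none
contains
  ! Grid: blockIdx%x selects m, blockIdx%y selects j.
  ! Each thread strides over i in steps of blockDim%x.
  attributes(global) subroutine foo(phin, phi, an, as, ae, aw, ap, ni, nj, mba)
    integer, value :: ni, nj, mba
    real :: phin(ni,nj,mba), phi(ni,nj,mba)
    real :: an(ni,nj,mba), as(ni,nj,mba), ae(ni,nj,mba)
    real :: aw(ni,nj,mba), ap(ni,nj,mba)
    integer :: i, j, m, tidx, nthrds

    m      = blockIdx%x        ! 1..mba
    j      = blockIdx%y + 1    ! 2..nj-1
    tidx   = threadIdx%x + 1   ! first i handled by this thread (>= 2)
    nthrds = blockDim%x

    do i = tidx, ni-1, nthrds
      phin(i,j,m) = an(i,j,m) * phi(i,j+1,m) &
                  + as(i,j,m) * phi(i,j-1,m) &
                  + ae(i,j,m) * phi(i,j+1,m) &
                  + aw(i,j,m) * phi(i,j-1,m) &
                  + ap(i,j,m) * phi(i,j+1,m)
    end do
  end subroutine foo

  ! Hypothetical host wrapper; the *_d arrays are assumed to be on the device already.
  subroutine launch_foo(phin_d, phi_d, an_d, as_d, ae_d, aw_d, ap_d, ni, nj, mba)
    integer, intent(in) :: ni, nj, mba
    real, device :: phin_d(ni,nj,mba), phi_d(ni,nj,mba)
    real, device :: an_d(ni,nj,mba), as_d(ni,nj,mba), ae_d(ni,nj,mba)
    real, device :: aw_d(ni,nj,mba), ap_d(ni,nj,mba)
    type(dim3) :: dimGrid, dimBlock

    dimBlock = dim3(512, 1, 1)      ! threads along i
    dimGrid  = dim3(mba, nj-2, 1)   ! one block per (m, interior j) pair
    call foo<<<dimGrid,dimBlock>>>(phin_d, phi_d, an_d, as_d, ae_d, aw_d, ap_d, ni, nj, mba)
  end subroutine launch_foo
end module stencil_mod
```

Because the inner loop strides by blockDim%x, this layout does not tie ni to the 512-thread limit; threads whose starting index exceeds ni-1 simply execute zero iterations.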