PGI User Forum

how does this newbie fix his code?
cablesb



Joined: 21 Jan 2010
Posts: 33

Posted: Mon Mar 26, 2012 4:38 pm

Whoops! In the host code, the "ceiling" text should read:
Code:

ceiling(real(simlength*2)/tPB),tPB,simlength*8*128

Don't know what happened to my cut and paste there.
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Tue Mar 27, 2012 7:57 am

Hi cablesb,

Quote:
Don't know what happened to my cut and paste there.
That's a problem with our User Forum software: it tries to reformat anything it thinks is an HTML tag. I'll look into getting it fixed.

I see one major issue and a couple of minor ones. First, CUDA Fortran global subroutines must have an explicit interface visible wherever they are called. Without one, your kernels will fail, and that is most likely what's happening here. To fix this, add an interface block for the global routines or put them in a module. Module procedures get an explicit interface automatically.
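
As a minimal sketch of the module approach (the names kernels_m, add_one, n, and tPB here are made up for illustration and are not taken from your code), the kernel lives in a module and the host program picks up its interface through the use statement:

```fortran
module kernels_m                      ! hypothetical module name
  use cudafor
  implicit none
contains
  ! Hypothetical kernel: adds 1.0 to every element of a
  attributes(global) subroutine add_one(a, n)
    integer, value :: n               ! scalars are passed by value
    real :: a(n)                      ! resides in device memory
    integer :: i
    ! Global 1-based index of this thread
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = a(i) + 1.0     ! guard the partially filled last block
  end subroutine add_one
end module kernels_m

program main
  use cudafor
  use kernels_m                       ! makes the kernel's explicit interface visible
  implicit none
  integer, parameter :: n = 1000, tPB = 128
  real :: a(n)
  real, device :: a_d(n)

  a = 0.0
  a_d = a                             ! host-to-device copy
  call add_one<<<ceiling(real(n)/tPB), tPB>>>(a_d, n)
  a = a_d                             ! device-to-host copy (waits for the kernel)
  print *, 'max error:', maxval(abs(a - 1.0))
end program main
```

If you would rather keep the kernel outside a module, the alternative is an interface block in the caller that repeats the kernel's attributes(global) declaration and argument list, but the module is usually less to maintain.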

Next, since you don't use automatic or assumed-size shared arrays in your kernels, there is no need to pass the third argument in the chevron syntax. Also, your block size, tPB, is only 2, which is small. Typically you want block sizes that are multiples of the warp size (i.e., 32).
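
To illustrate when the third chevron argument matters, here is a rough sketch with two made-up kernels (reverse_static and reverse_dynamic are hypothetical names, and the byte count assumes a 4-byte default real). The first declares a fixed-size shared array, so two chevron arguments are enough. The second declares an assumed-size shared array, whose size in bytes comes from the third argument. The block size of 64 is a multiple of the warp size.

```fortran
module reverse_kernels_m              ! hypothetical module name
  use cudafor
  implicit none
  integer, parameter :: tPB = 64      ! two warps of 32 threads
contains
  ! Fixed-size shared array: no third chevron argument needed
  attributes(global) subroutine reverse_static(d)
    real :: d(tPB)
    real, shared :: s(tPB)
    integer :: t
    t = threadIdx%x
    s(t) = d(t)
    call syncthreads()                ! all loads finish before any store
    d(t) = s(tPB - t + 1)
  end subroutine reverse_static

  ! Assumed-size shared array: size is set by the third chevron argument
  attributes(global) subroutine reverse_dynamic(d, n)
    integer, value :: n
    real :: d(n)
    real, shared :: s(*)
    integer :: t
    t = threadIdx%x
    s(t) = d(t)
    call syncthreads()
    d(t) = s(n - t + 1)
  end subroutine reverse_dynamic
end module reverse_kernels_m

program main
  use cudafor
  use reverse_kernels_m
  implicit none
  real :: h(tPB)
  real, device :: d_d(tPB)
  integer :: i

  h = [(real(i), i = 1, tPB)]
  d_d = h
  call reverse_static<<<1, tPB>>>(d_d)                ! two arguments
  call reverse_dynamic<<<1, tPB, 4*tPB>>>(d_d, tPB)   ! third argument: bytes of shared memory
  h = d_d
  print *, h(1), h(tPB)               ! reversed twice, so back in the original order
end program main
```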

With these changes, I see that shift in the resulting temp files.

- Mat
cablesb



Joined: 21 Jan 2010
Posts: 33

Posted: Tue Mar 27, 2012 10:07 am

Thanks, Mat.

The module seems to have done the trick. I have to admit, every example I've seen of a global subroutine uses a module, but I never saw it documented that you have to use modules, so I thought I would do it quick and dirty. Serves me right, I guess.

Re: the bytes argument in the chevrons and the low value of threads per block, I was trying all kinds of things to get the code to work.

I've looked around Amazon etc. for a good CUDA FORTRAN introduction. The closest I could find was a book for CUDA C that looks OK. Is there a good CUDA FORTRAN book around?

Thanks again!
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Tue Mar 27, 2012 1:23 pm

Quote:
Is there a good CUDA FORTRAN book around?
Greg Ruetsch and Massimiliano Fatica from NVIDIA are in the process of writing one. Here's a link to an early version: http://corsi.cineca.it/courses/scuolaAvanzata/Massimiliano%20Fatica/Book-Fatica.pdf. Greg just sent me an updated version that I can let you have. Send a note to PGI Customer Service (trs@pgroup.com) and I'll have them send you a copy.

Greg said that it still needs some work, especially the Multi-GPU chapter, but it should be helpful.

- Mat