PGI User Forum


Targeting CUDA and Two Processors

 
TheMatt



Joined: 06 Jul 2009
Posts: 317
Location: Greenbelt, MD

Posted: Fri Aug 03, 2012 10:10 am    Post subject: Targeting CUDA and Two Processors

While this is probably an instance of my Make environment being too generic, I thought I'd ask here: is it possible to use -Mcuda with -tp=[two processors]?

For example, I was compiling our model with -Mcuda=fastmath,ptxinfo,4.1,cc20 as well as -tp=nehalem-64,sandybridge-64 and all sorts of errors appeared. So I found an old CUF kernel test module and tried compiling it:
Code:
(1042) > pgfortran -Mcuda=cc20,4.1 -Minfo=all -tp=nehalem-64,sandybridge-64 test.F90
madd_dev:
     13, PGI Unified Binary version for -tp=sandybridge-64
     20, CUDA kernel generated
         20, !$cuf kernel do <<< (*,*), (32,1) >>>
         22, Sum reduction generated for sum
madd_dev:
     13, PGI Unified Binary version for -tp=nehalem-64
     20, CUDA kernel generated
         20, !$cuf kernel do <<< (*,*), (32,1) >>>
         22, Sum reduction generated for sum
/gpfsm/dnb31/tdirs/login/dscvr17.535.mathomp4/pgcudaforVZldn2J1Bx4d.gpu(178): error: function "madd_dev_20_gpu" has already been defined

/gpfsm/dnb31/tdirs/login/dscvr17.535.mathomp4/pgcudaforVZldn2J1Bx4d.gpu(267): error: function "madd_dev_22_gpu_red" has already been defined

2 errors detected in the compilation of "/gpfsm/dnb31/tdirs/login/dscvr17.535.mathomp4/pgnvdH0ldJtW2HxiD.nv0".
PGF90-F-0000-Internal compiler error. Device compiler exited with error status code       0 (test.F90: 25)
PGF90/x86-64 Linux 12.6-0: compilation aborted

It looks like the GPU compiler tried to compile the GPU code twice and found it had done it once already!

Now, the obvious fix is to make sure I don't pass two -tp targets when I'm building GPU code (I'll test this soon), since my GPUs are currently attached only to Westmere nodes.
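
As a rough illustration of that workaround, a build script could select the flags so that -Mcuda is never combined with a multi-target -tp list. This is only a sketch: the flag values are the ones from the compile above, while the function name pgi_flags and its two arguments are hypothetical and not part of any PGI tool.

```shell
# Hypothetical helper: choose PGI flags so that -Mcuda is never paired
# with more than one -tp target. Names here are illustrative assumptions.
pgi_flags() {
    use_cuda="$1"       # "yes" for a CUDA Fortran build, anything else for CPU-only
    gpu_host_tp="$2"    # single -tp target for the GPU hosts, e.g. nehalem-64
    if [ "$use_cuda" = "yes" ]; then
        # One host target only: -Mcuda with two -tp targets failed in the compile above.
        echo "-Mcuda=cc20,4.1 -tp=$gpu_host_tp"
    else
        # CPU-only build: keep the two-target unified binary.
        echo "-tp=nehalem-64,sandybridge-64"
    fi
}

FFLAGS="$(pgi_flags yes nehalem-64)"
echo "pgfortran $FFLAGS test.F90"
```

With the arguments shown, the script only prints the command line it would use, so it can be checked before being wired into a real Make environment.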

But in the future I could have GPUs on both Westmere and Sandy Bridge nodes, so it would be useful to know: is there a way to build a tp-unified binary that includes GPU code?

Thanks,
Matt
mkcolg



Joined: 30 Jun 2004
Posts: 6119
Location: The Portland Group Inc.

Posted: Fri Aug 03, 2012 2:32 pm

Hi Matt,

Sorry, but CUDA Fortran doesn't support the Unified Binary. I talked with Brent, and it's something he wants to add, but it won't be for a while.

- Mat
All times are GMT - 7 Hours