PGI User Forum

CUDA Fortran host code waits eternally.

DAVID-SPH | Posted: Mon May 28, 2012 2:58 am

We are using CUDA Fortran for CFD (SPH) and are having trouble at runtime (on some machines but not on others).
The host code appears to wait forever between kernels (we use separate kernels as natural synchronization barriers).
We get no CUDA errors from the kernels (as I said, on some machines the code executes flawlessly).
We are using only the default stream (stream 0, I believe). Adding a call to cudaDeviceSynchronize() makes the host code wait forever at that point, and even without it the program ends up stalling somewhere.

Has anybody run into the same problem? Honestly, we are quite puzzled and cannot continue. Since there are no compilation errors or runtime errors, we cannot find a way to fix this.

Best regards,

brentl | Posted: Tue May 29, 2012 6:07 pm

Can you give us any hints about the differences between the machines where it works and those where it doesn't? Related to any of the following:

1) Type of NVIDIA hardware on the different systems
2) Compute capability on the working and failing machines
3) OS versions (might be a long shot). All 64-bit OSes?
4) PGI versions. Which versions are you using? Same binary on failing and working machines?
5) NVIDIA driver version
6) Is the hang in the same kernel (or type of kernel) every time?
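
Items 1, 2 and 5 in the list above can be gathered on each machine with a short CUDA Fortran program. The following is only a minimal sketch, assuming the `cudafor` module shipped with the PGI compilers; the program name `device_report` is hypothetical. Note that `cudaDriverGetVersion` reports the CUDA version supported by the installed driver, not the display driver number (such as 301.xx), which still has to be read from the NVIDIA control panel.

```fortran
! Hypothetical diagnostic program: prints the CUDA driver/runtime versions
! and the name and compute capability of every visible device.
program device_report
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer :: istat, ndev, idev, driver_ver, runtime_ver

  ! CUDA version supported by the driver (not the 301.xx display driver number)
  istat = cudaDriverGetVersion(driver_ver)
  if (istat == cudaSuccess) print '(a,i0)', 'CUDA driver API version: ', driver_ver

  ! CUDA runtime version the executable is running against
  istat = cudaRuntimeGetVersion(runtime_ver)
  if (istat == cudaSuccess) print '(a,i0)', 'CUDA runtime version:    ', runtime_ver

  istat = cudaGetDeviceCount(ndev)
  if (istat /= cudaSuccess) then
     print '(2a)', 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
     stop
  end if

  ! Device numbering starts at 0
  do idev = 0, ndev - 1
     istat = cudaGetDeviceProperties(prop, idev)
     if (istat /= cudaSuccess) cycle
     print '(a,i0,2a)', 'Device ', idev, ': ', trim(prop%name)
     print '(a,i0,a,i0)', '  compute capability: ', prop%major, '.', prop%minor
  end do
end program device_report
```

Saved with a `.cuf` extension, this should build with `pgfortran`. Running the same binary on the working and failing machines makes the outputs directly comparable.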

DAVID-SPH | Posted: Wed May 30, 2012 3:49 am | Subject: differences are minimum

Well, the differences are minimal:

- Same OS in all cases (Windows 7, 64-bit).
- Same GPU architecture (Fermi) with compute capability 2.0.
- Same PGI version, since we compile on one machine and then deploy and test on several machines. We are using the very latest, 12.5.
- All machines are updated to the latest NVIDIA drivers; the only difference is that one is a laptop and the other two are desktops (301.27 and 301.32).
- Yes, the program stops at the same point on both machines that fail. The puzzling thing is that it stops and waits BETWEEN kernels, as if waiting for synchronization.
- It only works on the laptop, whose only difference is that it has a less powerful GPU and shares memory.

brentl | Posted: Wed May 30, 2012 2:54 pm

Is it possible for us to get the binary or source? I think we'll need to do some low-level digging. To my knowledge we haven't seen this behavior before. If sending us either source or binary is possible, mail it to trs@pgroup.com.

DAVID-SPH | Posted: Thu May 31, 2012 2:06 pm | Subject: source code sent

I just sent the source code with some test data.
I attached some instructions as well.

Thanks a lot.

All times are GMT - 7 Hours