PGI User Forum

unspecified launch failure

 
jand



Joined: 17 Aug 2008
Posts: 57

Posted: Mon Jun 04, 2012 4:53 pm    Post subject: unspecified launch failure

Hi,

I am getting the following error when running a code:

Code:
 
./spher
    80.85433959960938        0.4611731770210439     
    89.73984527587891        0.4757429858514819     
 REF_NLAY4_KERNEL:            4
 unspecified launch failure                                                                                                     
 REF_SUM_KERNEL:
 unspecified launch failure                                                                                                     
 REF_PROD_KERNEL:
 unspecified launch failure                                                                                                     
0: copyout Memcpy (host=0x1362060, dev=0x200300000, size=65536) FAILED: 4(unspecified launch failure)


The first two output lines show that the code ran and gave correct results. I put a loop around a single subroutine call so that the same routine is called over and over again. Sometimes, for no apparent reason, the code crashes: in the run above, after two successful calls. Other times it runs hundreds of times successfully before crashing.

When I compile the code in emulation mode, it appears to run fine for thousands of calls.

Any ideas why this might happen? It seems as if crashes occur more frequently when array sizes in the computation are large.

Any insight would be greatly appreciated.

Thanks, Jan
mkcolg



Joined: 30 Jun 2004
Posts: 6136
Location: The Portland Group Inc.

Posted: Tue Jun 05, 2012 9:52 am

Hi Jan,

Most likely your kernels are getting memory access errors. Check for out-of-bounds accesses and for reads or writes of uninitialized memory. Host code is much more forgiving when it accesses out-of-bounds memory, while device code will die in the same circumstance.
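As a rough illustration of that kind of bounds check: the sketch below is plain host-side C, not the actual kernel (which is not shown in this thread), and `bounds_report`, `in_range`, and `shifted_copy` are made-up names. The idea is to guard every computed index and record the first one that falls outside the array, rather than letting the access happen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical report structure: not part of any PGI or CUDA API. */
typedef struct {
    int  tripped;    /* 0 = every access was in range, 1 = a violation was seen */
    long bad_index;  /* first offending index; meaningful only if tripped == 1 */
} bounds_report;

/* Returns 1 if idx is a valid index into an array of length n.
   Otherwise records the first offending index in rep and returns 0. */
static int in_range(long idx, long n, bounds_report *rep)
{
    if (idx >= 0 && idx < n)
        return 1;
    if (!rep->tripped) {
        rep->tripped = 1;
        rep->bad_index = idx;
    }
    return 0;
}

/* Stand-in for a kernel loop body: out[i] = in[i + offset].
   Reads that would fall outside in[0..n-1] are skipped and reported. */
static bounds_report shifted_copy(const double *in, double *out,
                                  long n, long offset)
{
    bounds_report rep = {0, 0};
    assert(in != NULL && out != NULL);
    for (long i = 0; i < n; ++i) {
        long src = i + offset;
        if (in_range(src, n, &rep))
            out[i] = in[src];
    }
    return rep;
}
```

The same pattern could be carried into device code by writing the flag and offending index to a small array that is copied back and printed on the host after the kernel returns.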

Sans debugger support (we're working on it!), I will start commenting out portions of code in the kernel and/or use print statements to narrow down the problem code. Also, I'll sometimes use temp arrays to hold intermediate values.
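A minimal sketch of the temp-array idea, again in plain C with made-up names (`compute_indices`, `first_bad_entry`), and assuming the suspect intermediate value is a computed lookup index: the first routine stores the index in a temp array instead of using it right away, and the second scans that array on the host for the first entry that would be out of range.

```c
#include <assert.h>
#include <stddef.h>

/* Stage 1 of a hypothetical kernel: compute a lookup index for each
   element and store it in a temp array instead of using it for a
   memory access. */
static void compute_indices(const double *x, long *tmp_idx,
                            long n, double scale)
{
    assert(x != NULL && tmp_idx != NULL);
    for (long i = 0; i < n; ++i)
        tmp_idx[i] = (long)(x[i] * scale);
}

/* Host-side scan of the temp array: returns the position of the first
   entry outside [0, table_len), or -1 if every entry is a valid index. */
static long first_bad_entry(const long *tmp_idx, long n, long table_len)
{
    assert(tmp_idx != NULL);
    for (long i = 0; i < n; ++i)
        if (tmp_idx[i] < 0 || tmp_idx[i] >= table_len)
            return i;
    return -1;
}
```

If the scan reports a bad entry, the inputs at that position are a natural place to start looking; if it reports none, the next stage of the kernel can be re-enabled and checked the same way.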

- Mat
All times are GMT - 7 Hours