PGI User Forum

invalid device function
mkcolg



Joined: 30 Jun 2004
Posts: 6213
Location: The Portland Group Inc.

Posted: Mon Oct 03, 2011 1:30 pm

Hi Mike,

"cudaDeviceReset" is new in CUDA 4.0, which is why you're getting an undefined external error. Try again with "-Mcuda=cuda4.0". By default, CUDA 3.2 is used.
Quote:

I'm sorry, I didn't realise that you couldn't use -ta=nvidia and -Mcuda at the same time.

You can, but your settings were mismatched.

Besides adding "cudaDeviceReset", have you made other changes to the Matmul example?

- Mat
mcoffey



Joined: 26 Mar 2011
Posts: 16

Posted: Mon Oct 03, 2011 2:05 pm    Post subject: invalid device function

Mat, I'm very grateful for your help with this. I've been at it for weeks and getting nowhere!

When I include the following code, I get error 42 here:

cuda_info=cudaDeviceGetLimit(retVal,cudaLimitStackSize)
print *, "cudaLimitStackSize:", cudaGetErrorString(cuda_info)
print *, "cudaLimitStackSize:", retVal

and also at the line

Cdev = Csub(1:N,1:L)

When I comment out the 3 lines, I get no error in either place. It seems the call to cudaDeviceGetLimit causes errors elsewhere.
I have adapted the mmul routine to be generic and to multiply large matrices in blocks. My intention is to make a matmul routine that will run on any CUDA card with any amount of memory. We need to multiply and then invert matrices of 20,000 x 50,000.

I'm still getting error 8, invalid device function.
Mike
mkcolg




Posted: Mon Oct 03, 2011 3:02 pm

Hi Mike,

I tested "cudaDeviceGetLimit" on my laptop and got the same error. It also fails on a compute capability (CC) 1.3 device but succeeds on a CC 2.0 device, so the error seems to be expected on your device.
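
For anyone hitting the same error, here is a minimal sketch of guarding the call by compute capability. It assumes device 0 and a PGI CUDA Fortran build with CUDA 4.0 selected (for example "-Mcuda=cuda4.0"). The CC 2.0 threshold comes only from the test results above, not from a documented requirement, and the program and variable names are illustrative.

```fortran
program check_stack_limit
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer(kind=cuda_count_kind) :: stackSize
  integer :: istat

  ! Query device 0 (assumed to be the device in use for this sketch).
  istat = cudaGetDeviceProperties(prop, 0)
  if (istat /= cudaSuccess) then
     print *, "cudaGetDeviceProperties: ", cudaGetErrorString(istat)
     stop
  end if
  print *, "Compute capability: ", prop%major, ".", prop%minor

  ! Only ask for the stack size limit on CC 2.0 or later, where the
  ! call was observed to succeed in this thread.
  if (prop%major >= 2) then
     istat = cudaDeviceGetLimit(stackSize, cudaLimitStackSize)
     print *, "cudaLimitStackSize status: ", cudaGetErrorString(istat)
     if (istat == cudaSuccess) print *, "cudaLimitStackSize: ", stackSize
  else
     print *, "Skipping cudaDeviceGetLimit on a pre-CC 2.0 device"
  end if
end program check_stack_limit
```

Skipping the call avoids leaving an error behind that a later cudaGetLastError check could pick up.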

Quote:

I'm still getting error 8, invalid device function.

Can you post a reproducing example or send the full source to PGI Customer Service (trs@pgroup.com)? It's probably easiest if I just look at the code instead of guessing.

- Mat
mcoffey




Posted: Tue Oct 04, 2011 1:50 am    Post subject: invalid device function

OK Mat, will do, and thanks very much for your help. It's driving me crazy! I'll email the source and a little of the data.

Mike
mkcolg




Posted: Tue Oct 04, 2011 2:18 pm

Mike and I corresponded via email and determined that the invalid device function error was due to a double precision value in his kernel. CC 1.1 devices don't support double precision (it requires CC 1.3 or later).
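
One way to keep a stray double precision literal or variable out of a kernel is to route every real through a single kind parameter. The sketch below is only an illustration, not Mike's code: the module names, the "wp" parameter, and the "scale_add" kernel are all hypothetical. It needs a CUDA Fortran compiler and a CUDA device to run.

```fortran
module precision_m
  implicit none
  integer, parameter :: sp = kind(1.0)
  integer, parameter :: dp = kind(1.0d0)
  ! Working precision. Keep this at sp when targeting CC 1.1 devices.
  integer, parameter :: wp = sp
end module precision_m

module scale_m
  use cudafor
  use precision_m
  implicit none
contains
  ! Hypothetical kernel: x = alpha * x + 1, all in working precision.
  attributes(global) subroutine scale_add(x, n, alpha)
    integer, value :: n
    real(wp), value :: alpha
    real(wp) :: x(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    ! The _wp suffix keeps the literal in working precision. Writing
    ! 1.0d0 here would pull double precision into the kernel.
    if (i <= n) x(i) = alpha * x(i) + 1.0_wp
  end subroutine scale_add
end module scale_m

program precision_demo
  use cudafor
  use precision_m
  use scale_m
  implicit none
  integer, parameter :: n = 1024
  real(wp) :: x(n)
  real(wp), device :: x_d(n)
  integer :: istat

  x = 1.0_wp
  x_d = x
  call scale_add<<<(n + 255) / 256, 256>>>(x_d, n, 2.0_wp)
  ! Launch failures such as "invalid device function" are reported here.
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, "launch failed: ", cudaGetErrorString(istat)
  x = x_d
  print *, "max deviation from 3.0: ", maxval(abs(x - 3.0_wp))
end program precision_demo
```

With this layout, moving to double precision on a CC 1.3 or later device is a one-line change to "wp".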

In addition, he was using dynamic shared memory (automatic arrays) in his kernel but did not pass the shared memory size at the kernel launch, which was causing the kernel to crash. The fix is to either use fixed-size shared memory arrays or pass the shared memory size, in bytes, as the third launch configuration argument.
Code:
call mmul_kernel<<<dimGrid,dimBlock,(BLOCK_SIZE*BLOCK_SIZE*8)>>>( Adev, Bdev, Cdev, N, M, L )
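
As a self-contained illustration of the same idea, here is a minimal sketch of a kernel with an automatic shared array and a launch that passes its size. This is not Mike's mmul kernel: the "block_reverse" kernel and all names in it are hypothetical, and it assumes that n is a multiple of the block size and that a default real is 4 bytes.

```fortran
module reverse_m
  use cudafor
  implicit none
contains
  ! Hypothetical kernel: reverses the elements within each thread block.
  attributes(global) subroutine block_reverse(a, n, bs)
    integer, value :: n, bs
    real :: a(n)
    ! Automatic shared array: its storage comes from dynamic shared
    ! memory, so the launch must say how many bytes to reserve.
    real, shared :: tile(bs)
    integer :: t, i
    t = threadIdx%x
    i = (blockIdx%x - 1) * blockDim%x + t
    tile(t) = a(i)
    call syncthreads()
    a(i) = tile(bs - t + 1)
  end subroutine block_reverse
end module reverse_m

program reverse_demo
  use cudafor
  use reverse_m
  implicit none
  integer, parameter :: n = 1024, bs = 256   ! n must be a multiple of bs
  real :: a(n), expected(n)
  real, device :: a_d(n)
  integer :: i, istat

  do i = 1, n
     a(i) = real(i)
     ! Element i ends up holding the mirrored element of its own block.
     expected(i) = real(((i - 1) / bs) * bs + bs - mod(i - 1, bs))
  end do

  a_d = a
  ! Third launch argument: bytes of dynamic shared memory per block
  ! (bs elements * 4 bytes per default real).
  call block_reverse<<<n / bs, bs, bs * 4>>>(a_d, n, bs)
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, "launch failed: ", cudaGetErrorString(istat)
  a = a_d
  print *, "max error: ", maxval(abs(a - expected))
end program reverse_demo
```

Leaving out the third launch argument gives the kernel no dynamic shared memory at all, so accesses to "tile" are out of bounds. Declaring the array with a compile-time constant size, for example "real, shared :: tile(256)", removes the need for the extra argument.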


- Mat
All times are GMT - 7 Hours