PGI User Forum


Device kernel error (are maths operations the problem?)
Torkin



Joined: 18 Apr 2012
Posts: 29

Posted: Sat Apr 26, 2014 10:21 am    Post subject: Device kernel error (are maths operations the problem?)

Hello

I have a question about using the log and atan functions in a device kernel subroutine.

Can these two mathematical functions be called in a kernel, as they can in normal subroutines run on the CPU?

In this code I am interested in a(i,j), which I copy back to the host in the CPU subroutine. However, the kernel is not working, and I suspect the maths operations (log, atan, abs).

This is how I launch the kernel:
Code:

       call CoefficientDevice<<<1 ,dim3(20,20,1)>>>( adev, ndev, xdev, ydev, bcdev, nodedev, dnormdev )
       istat = cudathreadsynchronize()

       print *, 'Device Done'
       pause

       print *, 'changing back to a from adev'
       pause
       call system_clock( count=c33 )
       a = adev
       write(6666,*) a
       pause


Unfortunately, I get this error at the 'changing back to a from adev' step:
0: copyout Memcpy (host=0x88253f0, dev=0x200000, size=1600) FAILED: 30(unknown error)

My GPU is a GT525 with 1 GB of RAM.

I would really appreciate the help.

Ahmed


Last edited by Torkin on Mon Apr 28, 2014 5:14 pm; edited 1 time in total
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Mon Apr 28, 2014 8:49 am

Hi Ahmed,

Why do you think "log" and "atan2" are the problem? It's possible, but I would think that something else is going on.

Note that the Memcpy error is most likely due to the kernel error. Unless you add error checking after your kernel, the error won't become evident until the next time the device is accessed (such as a memcpy).
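
A minimal sketch of that kind of check, using a hypothetical kernel named fill and placeholder array names that are not taken from the code in this thread. Kernel launches are asynchronous, so cudaGetLastError called right after the launch reports only launch-time problems (for example a bad launch configuration); a fault inside the kernel shows up in the return value of a later synchronizing call such as cudaDeviceSynchronize.

```fortran
module check_demo
  use cudafor
  implicit none
contains
  ! Hypothetical kernel, for illustration only.
  attributes(global) subroutine fill(a, n)
    integer, value :: n
    real, device :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = log(real(i))
  end subroutine fill
end module check_demo

program check_kernel
  use cudafor
  use check_demo
  implicit none
  integer, parameter :: n = 400      ! placeholder size
  real :: a(n)
  real, device :: adev(n)
  integer :: istat

  call fill<<<1, n>>>(adev, n)

  ! Launch-time errors (bad grid/block size, etc.) are visible immediately.
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, 'launch failed: ', trim(cudaGetErrorString(istat))

  ! Errors raised while the kernel runs are returned by the synchronizing call.
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'kernel failed: ', trim(cudaGetErrorString(istat))

  a = adev                           ! copy back only after both checks
  print *, a(1), a(n)
end program check_kernel
```

With both checks in place, a failure is reported at the kernel rather than at the next memcpy.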

I don't see anything obviously wrong in your code, but I would start with:

Code:
al   = sqrt((y(1,node(2,j))-y(1,node(1,j)))**2 +(y(2,node(2,j))-y(2,node(1,j)))**2)

Are values of "node" always between 1 and n?
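
One way to answer this on the host before launching the kernel, sketched with a small made-up node array. The size n and the connectivity values below are placeholders, not data from this thread.

```fortran
program check_node_bounds
  implicit none
  integer, parameter :: n = 4      ! placeholder size, not from the thread
  integer :: node(2, n)
  integer :: j

  ! Placeholder connectivity: element j joins node(1,j) to node(2,j).
  node = reshape([1, 2,  2, 3,  3, 4,  4, 1], [2, n])

  if (all(node >= 1 .and. node <= n)) then
     print *, 'all node indices are within 1..n'
  else
     ! Report each offending column so the bad entry is easy to find.
     do j = 1, n
        if (any(node(:, j) < 1 .or. node(:, j) > n)) then
           print *, 'out of range at j =', j, ':', node(:, j)
        end if
     end do
  end if
end program check_node_bounds
```

An index outside 1..n would make y(1,node(2,j)) read outside the array on the device, which is one way to get a kernel fault.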

- Mat
Torkin




Posted: Mon Apr 28, 2014 9:06 am

Hey Mat,

Thank you for the reply.

I have just read that "log" and "atan2" are computable on the GPU. I wonder if there is a fault with the way I am using these functions.

Quote:
Are values of "node" always between 1 and n?


Yes. node is a 2 x n array, node(2,n); all of its values are integers, and it is transferred from the host to the GPU.

Is there a problem with calling such functions or are there restrictions to my case? Does abs() work on the GPU too?

For your information, I compiled both with and without -Mcuda=fastmath, and both builds give the same error when I run the executable.

- Ahmed
mkcolg




Posted: Mon Apr 28, 2014 10:29 am

Quote:
Is there a problem with calling such functions or are there restrictions to my case? Does abs() work on the GPU too?

These intrinsics are supported in CUDA Fortran for the data types listed in Chapter 3 of the CUDA Fortran Programming Guide. If there were a problem with how you're using these intrinsics, I would expect the results to include NaNs rather than a kernel error.

What is the return code from your kernel?

Code:
    call CoefficientDevice<<<1 ,dim3(20,20,1)>>>( adev, ndev, xdev, ydev, bcdev, nodedev, dnormdev )
      print *, cudaGetErrorString(cudaGetLastError())


I did just notice that you have a syncthreads call which, because of the if statement, won't get executed by all threads. This can cause problems. What happens if you comment it out? (It appears to be extraneous.)
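
To illustrate the point with a hypothetical kernel (not the CoefficientDevice code from this thread): when a barrier sits inside a branch that only some threads of a block take, the kernel can hang or give unintended results. A common pattern is to keep the guard around the work and place syncthreads where every thread of the block reaches it. The kernel name, the 20x20 tile, and the size n below are assumptions for the sketch.

```fortran
module sync_demo
  use cudafor
  implicit none
contains
  ! Hypothetical kernel: transposes an n x n matrix in place via shared memory.
  ! Assumes a single block of 20 x 20 threads and n <= 20.
  attributes(global) subroutine transpose_tile(a, n)
    integer, value :: n
    real, device :: a(n, n)
    real, shared :: tile(20, 20)
    integer :: i, j
    logical :: inside

    i = threadIdx%x
    j = threadIdx%y
    inside = (i <= n .and. j <= n)

    if (inside) tile(i, j) = a(i, j)

    ! The barrier is outside the if block, so every thread in the block reaches it.
    call syncthreads()

    if (inside) a(i, j) = tile(j, i)
  end subroutine transpose_tile
end module sync_demo

program sync_main
  use cudafor
  use sync_demo
  implicit none
  integer, parameter :: n = 16       ! placeholder size, smaller than the block
  real :: a0(n, n), a(n, n)
  real, device :: adev(n, n)
  integer :: istat, i, j

  do j = 1, n
     do i = 1, n
        a0(i, j) = real(i) + 100.0 * real(j)
     end do
  end do

  adev = a0
  call transpose_tile<<<1, dim3(20, 20, 1)>>>(adev, n)
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'kernel failed: ', trim(cudaGetErrorString(istat))
  a = adev

  print *, 'transposed correctly: ', all(a == transpose(a0))
end program sync_main
```

Threads with i or j greater than n do no work here but still take part in the barrier.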

- Mat
Torkin




Posted: Mon Apr 28, 2014 1:27 pm

Hey Mat

Quote:
What is the return code from your kernel?

Code:
call CoefficientDevice<<<1 ,dim3(20,20,1)>>>( adev, ndev, xdev, ydev, bcdev, nodedev, dnormdev )
print *, cudaGetErrorString(cudaGetLastError())


I tried it and it printed out 'no error'. Any ideas?

- Ahmed
All times are GMT - 7 Hours
Page 1 of 3