CUDA Fortran Kernel Double Precision Array

Sam Murdoch
Posts: 1
Joined: Dec 05 2018

Post by Sam Murdoch » Wed Sep 11, 2019 2:51 am

Hi,

I'm using an NVIDIA GeForce RTX 2080 Ti card.

I have a CUDA Fortran kernel into which a number of DOUBLE PRECISION arrays are passed.
The kernel calculates a value based on these for the given thread index (I), then attempts to store it in an array, R, declared:
DOUBLE PRECISION, INTENT(OUT) :: R(:)
For illustration purposes, the calculation is

(a(I)*b(I)) + c + d + e + f + g + h
When I do

WRITE(*,*) (a(I)*b(I)) + c + d + e + f + g + h 
in the kernel, I can see that the term correctly evaluates to 4.4408920985006262E-016

When I set:

R(I) = (a(I)*b(I)) + c + d + e + f + g + h
then
WRITE(*,*) R(I)
The value of R(I) is zero.

I know that the values are at the limit of machine precision, so this must be significant. If I explicitly set R(I) to some small constant, for example

R(I) = 5
Then everything works as expected, so I don't believe there is anything wrong with the process of calling and returning values from the kernel.
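
For a sense of scale, the printed value 4.4408920985006262E-016 is exactly 2**(-51), that is, twice the double precision machine epsilon and one unit in the last place for values between 2 and 4. The host-side sketch below, in standard Fortran, checks this. The program and variable names are illustrative and are not part of the original kernel.

```fortran
program check_magnitude
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Value printed inside the kernel, copied from the post above
  real(dp), parameter :: observed = 4.4408920985006262e-16_dp

  ! Machine epsilon for double precision (2**(-52) for IEEE 754 binary64)
  write(*,*) 'epsilon(1.0_dp)        =', epsilon(1.0_dp)
  ! Ratio of the observed value to machine epsilon
  write(*,*) 'observed / epsilon     =', observed / epsilon(1.0_dp)
  ! Is the observed value exactly 2**(-51)?
  write(*,*) 'observed == 2**(-51)   :', observed == 2.0_dp**(-51)
  ! Is it exactly the spacing between adjacent doubles at 2.0?
  write(*,*) 'observed == ulp at 2.0 :', observed == spacing(2.0_dp)
end program check_magnitude
```

A value of exactly this size is consistent with a rounding residual left over when terms of order one nearly cancel, in which case it can depend on the order and rounding of the individual operations.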

Is there a precision limitation that applies or compiler flags that I could be missing?

Any help here would be greatly appreciated.

Thanks

mkcolg
Posts: 8048
Joined: Jun 30 2004

Re: CUDA Fortran Kernel Double Precision Array

Post by mkcolg » Wed Sep 11, 2019 9:12 am

Hi Sam,
"Is there a precision limitation that applies or compiler flags that I could be missing?"

FMA is enabled by default in device code, but I highly doubt that it would cause this issue. Still, you can try compiling with "-Mnofma" to disable it.
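
To illustrate why FMA can matter for results of this size: a fused multiply-add rounds a*b + c once, whereas a separate multiply and add round twice. The host-side sketch below uses hypothetical values (not those from the kernel in question), chosen so that the two-step result is exactly zero while the exact result is 2**(-54). Whether the single-expression line is fused depends on the compiler, its flags, and the target, so no particular output is claimed for that line.

```fortran
program fma_sensitivity
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! volatile keeps the compiler from folding these at compile time and
  ! forces the product p to be rounded and stored before it is reused
  real(dp), volatile :: x, c, p

  x = 1.0_dp + 2.0_dp**(-27)
  c = -(1.0_dp + 2.0_dp**(-26))

  ! Exact product is 1 + 2**(-26) + 2**(-54); stored, it rounds to 1 + 2**(-26)
  p = x * x

  ! Two roundings (multiply, then add): the 2**(-54) term is already lost
  write(*,*) 'multiply, store, then add :', p + c
  ! One expression: zero if evaluated in two steps, 2**(-54) if fused
  write(*,*) 'single expression         :', x * x + c
  write(*,*) '2**(-54) for comparison   :', 2.0_dp**(-54)
end program fma_sensitivity
```

The same effect can make a device result differ from a host result, or change when "-Mnofma" is added, without either result being wrong.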

It doesn't quite make sense why printing "R(I)" would differ from printing the computation directly. The compiler would need to generate a temp variable to hold the result before printing, which shouldn't be different than if it were stored to R.

Can you post or send to PGI Customer Service (trs@pgroup.com) a reproducing example? That would help to determine what's going on.
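
A reproducing example could be as small as the following CUDA Fortran sketch. It is an assumption about the structure of the original code, not the poster's actual kernel: the module, kernel, and variable names and all input values are hypothetical, and the expression is shortened to a(I)*b(I) + c. It prints the expression and R(I) inside the kernel, then the copy returned to the host, so that the three can be compared.

```fortran
module repro_kernels
  use cudafor
  implicit none
contains
  ! Hypothetical kernel mirroring the declaration of R in the original post
  attributes(global) subroutine compute_r(a, b, c, r, n)
    double precision, intent(in)  :: a(:), b(:)
    double precision, value       :: c
    double precision, intent(out) :: r(:)
    integer, value                :: n
    integer :: i

    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) then
      ! Print the expression directly, then store it and print the stored value
      write(*,*) 'expr:', i, a(i) * b(i) + c
      r(i) = a(i) * b(i) + c
      write(*,*) 'R(I):', i, r(i)
    end if
  end subroutine compute_r
end module repro_kernels

program repro
  use cudafor
  use repro_kernels
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n), b(n), r(n), c
  double precision, device :: a_d(n), b_d(n), r_d(n)
  integer :: istat

  ! Hypothetical inputs; replace with values taken from the failing case
  a = 1.0d0 + 2.0d0**(-27)
  b = a
  c = -(1.0d0 + 2.0d0**(-26))

  a_d = a                    ! host-to-device copies
  b_d = b
  call compute_r<<<1, n>>>(a_d, b_d, c, r_d, n)
  istat = cudaDeviceSynchronize()
  r = r_d                    ! device-to-host copy

  write(*,*) 'host copy of R:', r
end program repro
```

With the PGI compilers this would typically be saved with a .cuf extension and built once with and once without "-Mnofma" to see whether the printed values change.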

Thanks,
Mat
