I'm using an NVIDIA GeForce RTX 2080 Ti card.

I have a CUDA Fortran kernel into which a number of DOUBLE PRECISION arrays are passed.

The kernel calculates a value based on these for the given thread index (I), then attempts to store it in an array, R, declared:

`DOUBLE PRECISION, INTENT(OUT) :: R(:)`

For illustration purposes, the calculation is:

`(a(I)*b(I)) + c + d + e + f + g + h`

`WRITE(*,*) (a(I)*b(I)) + c + d + e + f + g + h `

When I set:

`R(I) = (a(I)*b(I)) + c + d + e + f + g + h`

`WRITE(*,*) R(I)`

The value of R(I) is zero.

I know that the values are on the boundaries of machine precision so this must be significant. If I explicitly set R(I) to some small constant, for example

`R(I) = 5`

Is there a precision limitation that applies or compiler flags that I could be missing?
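For reference, the setup described above can be sketched as a minimal reproducer. This is only an illustrative sketch, not the actual code: the module, kernel, and program names, the array size, the launch configuration, and all input values are placeholders, and it assumes that c through h are scalars passed by value.

```fortran
MODULE kernels
  USE cudafor
  IMPLICIT NONE
CONTAINS
  ! Hypothetical kernel: one thread per element of R.
  ATTRIBUTES(GLOBAL) SUBROUTINE compute(a, b, c, d, e, f, g, h, R, n)
    DOUBLE PRECISION, INTENT(IN)  :: a(:), b(:)
    DOUBLE PRECISION, VALUE       :: c, d, e, f, g, h   ! assumed to be scalars
    DOUBLE PRECISION, INTENT(OUT) :: R(:)
    INTEGER, VALUE                :: n
    INTEGER :: I

    ! Global 1-based thread index
    I = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    IF (I <= n) THEN
      R(I) = (a(I)*b(I)) + c + d + e + f + g + h
    END IF
  END SUBROUTINE compute
END MODULE kernels

PROGRAM repro
  USE cudafor
  USE kernels
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 256
  DOUBLE PRECISION         :: a(n), b(n), R(n)
  DOUBLE PRECISION, DEVICE :: a_d(n), b_d(n), R_d(n)
  INTEGER :: istat

  ! Placeholder inputs; R starts at a sentinel so an unwritten element
  ! can be told apart from a stored zero.
  a = 1.0D-8
  b = 2.0D-8
  R = -1.0D0

  ! Host-to-device copies
  a_d = a
  b_d = b
  R_d = R

  CALL compute<<<(n + 127) / 128, 128>>>(a_d, b_d, 1.0D-16, 2.0D-16, &
       3.0D-16, 4.0D-16, 5.0D-16, 6.0D-16, R_d, n)

  ! Wait for the kernel and report any error it raised
  istat = cudaDeviceSynchronize()
  IF (istat /= cudaSuccess) WRITE(*,*) cudaGetErrorString(istat)

  ! Device-to-host copy, then print one element
  R = R_d
  WRITE(*,*) R(1)
END PROGRAM repro
```

A sketch like this should build with `nvfortran -cuda repro.cuf`. The sentinel in R and the error check after the launch are there only to help separate a store that never happened from a value that really is zero.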

Any help here would be greatly appreciated.

Thanks