PGI User Forum
Problem using LFSR random number generator in CUDA FORTRAN
Peter Nightingale



Joined: 14 Oct 2010
Posts: 20

Posted: Tue Feb 22, 2011 6:47 am

In the two cases where you get the correct result, that is simply because the compiler evaluates the shift at compile time. You can inspect the C code the compiler generates by compiling with -Mcuda=keepgpu and then looking for a file with the extension .gpu.

What seems to work is to define a local variable var1 in subroutine testshift and to change d_test(3) = ISHFT(var,-1) to d_test(3) = ISHFT(var1,-1). That is probably also faster when you combine several bit-manipulation operations.
mkcolg



Joined: 30 Jun 2004
Posts: 6667
Location: The Portland Group Inc.

Posted: Tue Feb 22, 2011 10:05 am

Hi Tom, Peter,

The example 'works' with "-ta=nvidia" because this flag forces optimization level -O2. Using -O2 directly has the same effect.

As Peter points out, the construct "ISHFT(-4,-1)" is evaluated on the host during compilation, so it results in the constant value 2147483646. For "ISHFT(var,-1)" at "-O2", the compiler can determine that var is constant and therefore replaces the expression with a constant at compile time.

Note that our compiler engineers do have a fix in place that should be available in the next release (11.3).

Code:

% cat ishift.f90
module testshift

integer :: test ! host
integer, device :: d_test ! device

contains

attributes(global) subroutine testshft ()
integer :: var
var = -4
d_test = ISHFT(var,-1)
end subroutine testshft

end module testshift

program foo
use testshift

call testshft<<<1,1>>>() ! carry out ishft on gpu
test = d_test ! copy device result to host
print *, test ! print result
end program foo

% pgf90 -Mcuda=keepgpu ishift.f90   ! Using Pre-release 11.3 compiler
% a.out
   2147483646
% cat ishift.001.gpu
#include "cuda_runtime.h"
#include "pgi_cuda_runtime.h"
#include "ishift.001.h"
__device__ struct{
int m0;
}__align__(16) _testshift_16;
extern "C" __global__ void testshft()
{
int var;
var = -4;
_testshift_16.m0 = (unsigned int)var>>(1);    /* <<< Fix is to cast var to unsigned int */
}
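
To illustrate why the cast matters: Fortran's ISHFT with a negative shift count is a logical shift that fills the vacated high bits with zeros, while in C the result of >> on a negative signed int is implementation-defined and is usually an arithmetic shift that copies the sign bit. The sketch below is plain host C written for this explanation, not compiler output; the function names are hypothetical.

```c
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with zeros.
   This mirrors ISHFT(var, -n) in Fortran. Intended for 1 <= n <= 31,
   where the result always fits back into int32_t. */
static int32_t logical_shift_right(int32_t var, unsigned n)
{
    return (int32_t)((uint32_t)var >> n);
}

/* Right shift applied to the signed value itself. For negative values the
   C standard leaves the result implementation-defined; common compilers
   perform an arithmetic shift that replicates the sign bit. */
static int32_t signed_shift_right(int32_t var, unsigned n)
{
    return var >> n;
}
```

With var = -4 and n = 1, logical_shift_right returns 2147483646, the value printed by the program above. signed_shift_right typically returns -2 for the same arguments, although that result is not guaranteed by the C standard.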


Thanks,
Mat
tom.rb.edwards



Joined: 02 Dec 2010
Posts: 35

Posted: Mon Mar 21, 2011 3:52 am    Post subject: Workaround

I found a workaround.

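I fixed the problem by replacing ISHFT with IBITS whenever a negative integer is shifted to the right (negative bit shift). Specifically, for 32-bit integers I replace ISHFT(var,-x) with IBITS(var,x,32-x). IBITS extracts the upper 32-x bits starting at bit x and returns them right-justified with zeros above, which is exactly the result a logical right shift by x should give.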
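
The equivalence can be checked on the host with a small C sketch. It assumes 32-bit integers and uses unsigned arithmetic so that every shift is logical; ibits32 and ishft_right32 are hypothetical helper names that emulate the Fortran intrinsics and are not part of any library.

```c
#include <stdint.h>

/* Emulates Fortran IBITS(i, pos, len) for 32-bit values: extract len bits
   starting at bit pos (bit 0 is the least significant), right-justified,
   with all higher bits zero. Assumes pos + len <= 32. */
static uint32_t ibits32(uint32_t i, unsigned pos, unsigned len)
{
    uint32_t mask;
    if (len == 0 || pos >= 32) {
        return 0u;
    }
    mask = (len >= 32) ? 0xFFFFFFFFu : (((uint32_t)1 << len) - 1u);
    return (i >> pos) & mask;
}

/* Emulates ISHFT(i, -x): logical right shift by x bits, 0 <= x <= 32. */
static uint32_t ishft_right32(uint32_t i, unsigned x)
{
    return (x >= 32) ? 0u : (i >> x);
}
```

For any x between 1 and 31, ibits32(v, x, 32 - x) and ishft_right32(v, x) return the same value, which is the substitution used in the workaround.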

This is the old random number generator:
Code:
      b  = ishft(ieor(ishft(z1,6),z1),-13)
      z1 = ieor(ishft(iand(z1,-2),18),b)

      b  = ishft(ieor(ishft(z2,2),z2),-27)
      z2 = ieor(ishft(iand(z2,-8),2),b)

      b  = ishft(ieor(ishft(z3,13),z3),-21)
      z3 = ieor(ishft(iand(z3,-16),7),b)

      b  = ishft(ieor(ishft(z4,3),z4),-12)
      z4 = ieor(ishft(iand(z4,-128),13),b)

      rand=ishft( ieor(ieor(ieor(z1,z2),z3),z4) , -1)*4.656612873077d-10

This is the new random number generator:
Code:
    b  = ibits(ieor(ishft(z1,6),z1),13,19)
    z1 = ieor(ishft(iand(z1,-2),18),b)

    b  = ibits(ieor(ishft(z2,2),z2),27,5)
    z2 = ieor(ishft(iand(z2,-8),2),b)

    b  = ibits(ieor(ishft(z3,13),z3),21,11)
    z3 = ieor(ishft(iand(z3,-16),7),b)

    b  = ibits(ieor(ishft(z4,3),z4),12,20)
    z4 = ieor(ishft(iand(z4,-128),13),b)
   
    rand=ibits(ieor(ieor(ieor(z1,z2),z3),z4),1,31)*AM
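
For comparison, here is a host-side C transcription of the same update step, offered as a sketch for checking results rather than as reference code. It assumes 32-bit state words held in uint32_t, so every >> is a logical shift and the masks -2, -8, -16 and -128 become their unsigned bit patterns. The names lfsr_state and lfsr_next are hypothetical. The value of AM is not shown in this post, so the sketch uses the literal scale factor from the old version.

```c
#include <stdint.h>

/* The four state words, named z1..z4 as in the Fortran code. */
typedef struct {
    uint32_t z1, z2, z3, z4;
} lfsr_state;

/* Advance the generator by one step and return a value in [0, 1). */
static double lfsr_next(lfsr_state *s)
{
    uint32_t b;

    b     = ((s->z1 << 6) ^ s->z1) >> 13;
    s->z1 = ((s->z1 & 0xFFFFFFFEu) << 18) ^ b;   /* mask is -2 in Fortran   */

    b     = ((s->z2 << 2) ^ s->z2) >> 27;
    s->z2 = ((s->z2 & 0xFFFFFFF8u) << 2) ^ b;    /* mask is -8 in Fortran   */

    b     = ((s->z3 << 13) ^ s->z3) >> 21;
    s->z3 = ((s->z3 & 0xFFFFFFF0u) << 7) ^ b;    /* mask is -16 in Fortran  */

    b     = ((s->z4 << 3) ^ s->z4) >> 12;
    s->z4 = ((s->z4 & 0xFFFFFF80u) << 13) ^ b;   /* mask is -128 in Fortran */

    /* Drop the lowest bit so the value fits in 31 bits, then scale by
       roughly 2^-31 (the literal from the old Fortran version). */
    return (double)((s->z1 ^ s->z2 ^ s->z3 ^ s->z4) >> 1) * 4.656612873077e-10;
}
```

Starting from the state (2, 8, 16, 128), one call leaves the state at (524288, 32, 2048, 1048576). Because the masks discard the low 1, 3, 4 and 7 bits of z1 to z4 respectively, seeds should have at least one higher bit set in each word.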


The definition of IBITS can be found here: http://gcc.gnu.org/onlinedocs/gfortran/IBITS.html

Thanks for the help
Peter Nightingale



Joined: 14 Oct 2010
Posts: 20

Posted: Mon Mar 21, 2011 11:02 am

According to the release notes, the shift problem has been solved in version 11.3; see http://www.pgroup.com/support/release_tprs_2011.htm (TPR 17689).

Page 4 of 4
All times are GMT - 7 Hours