PGI User Forum


How can I avoid this crash?

 
mos



Joined: 05 May 2011
Posts: 1

Posted: Thu May 12, 2011 2:54 am    Post subject: How can I avoid this crash?

The following loop in my program crashes without any error message (no out-of-memory error or anything similar):

Code:
SUBROUTINE iterate_wo( td )! (p1,p2,q1,q2)->(p1',p2',q1',q2')
   USE params
   USE accel_lib
   USE precision
   USE data_arr

   IMPLICIT NONE
   INTEGER(KIND=8) :: i,j,td
   REAL(fp_kind),DIMENSION(4) :: xloc

   !$acc data region copy( x )
   !$acc region
   !$acc do parallel private( xloc )
   DO i=1,niniconds*niniconds
      xloc(:)=x(:,i)
      !$acc do seq
      DO j=1,td
         xloc(1)=         xloc(1)+k1*(1+e*COS(xloc(4)))*SIN(xloc(3))
         xloc(2)=         xloc(2)+k2*(1+e*COS(xloc(3)))*SIN(xloc(4))
         xloc(3)=xloc(3)+xloc(1)
         xloc(4)=xloc(4)+xloc(2)
      END DO
      x(:,i)=xloc(:)
   END DO
   !$acc end region
   !$acc end data region

END SUBROUTINE iterate_wo


x is defined as
Code:
REAL(fp_kind), ALLOCATABLE,  DIMENSION(:,:) :: x

in the data_arr module and allocated like this:
Code:
n=niniconds*niniconds
ALLOCATE( x(4,n) )

where niniconds=256. td is usually quite large (around 10^7).

I assume that each thread of the parallel loop gets its own complete copy of x in device memory. Is that true, and if so, how can I change the loop to use less memory?

Relevant compiler output:
Code:
iterate_wo:
     85, Generating copy(x(:,:))
     88, Loop is parallelizable
     89, Loop is parallelizable
         Accelerator kernel generated
         88, !$acc do parallel ! blockidx%y
         89, !$acc do parallel, vector(4) ! blockidx%x threadidx%x
     91, Complex loop carried dependence of 'xloc' prevents parallelization
         Loop carried reuse of 'xloc' prevents parallelization
         Accelerator kernel generated
         88, !$acc do parallel, vector(32) ! blockidx%x threadidx%x
         91, !$acc do seq
             Non-stride-1 accesses for array 'xloc'
     97, Loop is parallelizable
         Accelerator kernel generated
         88, !$acc do parallel ! blockidx%y
         97, !$acc do parallel, vector(256) ! blockidx%x threadidx%x


Best regards,
mos

EDIT: fp_kind is defined like this:
Code:
integer, parameter :: Double = kind(0.0d0) ! Double precision
integer, parameter :: fp_kind = Double
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Thu May 12, 2011 12:15 pm    Post subject: Re: How can I avoid this crash?

Hi Mos,

My best guess is that, with the directives as written, the compiler is creating three kernels (note the three "Accelerator kernel generated" messages in your compiler output). You really want a single kernel so that the same private xloc is used throughout. As it is now, each kernel uses a different xloc. Try changing your directives to the following so that only the i loop is parallelized. The "kernel" clause tells the compiler to use the body of the loop as the device kernel.
Code:

   !$acc region
   !$acc do parallel, vector(256), kernel, private( xloc )
   DO i=1,niniconds*niniconds
      xloc(:)=x(:,i)
      DO j=1,td
         xloc(1)=         xloc(1)+k1*(1+e*COS(xloc(4)))*SIN(xloc(3))
         xloc(2)=         xloc(2)+k2*(1+e*COS(xloc(3)))*SIN(xloc(4))
         xloc(3)=xloc(3)+xloc(1)
         xloc(4)=xloc(4)+xloc(2)
      END DO
      x(:,i)=xloc(:)
   END DO
   !$acc end region
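
On the memory question from the original post, a rough back-of-the-envelope sketch may help. It assumes that copy( x ) places a single copy of x on the device that all threads share, and that only the 4-element private xloc is replicated, at most once per i iteration. The module and function names (footprint, x_bytes) are made up for this illustration and appear nowhere in the original program.

```fortran
module footprint
   implicit none
   integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integers
   integer, parameter :: fp_kind = kind(0.0d0)        ! same definition as in the post
contains
   ! Bytes occupied by a (4, niniconds*niniconds) array of elem_bytes-sized elements
   pure function x_bytes(niniconds, elem_bytes) result(nbytes)
      integer(i8), intent(in) :: niniconds, elem_bytes
      integer(i8) :: nbytes
      nbytes = 4_i8 * niniconds * niniconds * elem_bytes
   end function x_bytes
end module footprint

program show_footprint
   use footprint
   implicit none
   integer(i8) :: niniconds, elem_bytes, n

   niniconds  = 256_i8
   elem_bytes = storage_size(1.0_fp_kind) / 8   ! storage_size returns bits
   n = niniconds * niniconds

   ! One shared copy of x on the device (assumption stated above)
   print '(a,i0,a)', 'x on device:          ', x_bytes(niniconds, elem_bytes), ' bytes'
   ! Upper bound: one private 4-element xloc per i iteration
   print '(a,i0,a)', 'all xloc copies (max): ', n * 4_i8 * elem_bytes, ' bytes'
end program show_footprint
```

With niniconds=256 and 8-byte reals, both figures come to 2097152 bytes (2 MiB), so under these assumptions device memory is unlikely to be what makes the program crash.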


Also, try varying the vector size to see which value gives the best performance.
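
A possible variant, not proposed in the thread itself, is to drop the private array altogether and carry the four values in scalars. The idea is that scalars assigned inside the parallel loop are normally treated as private to each iteration, so no private clause is needed. This is only a sketch: the values of k1, k2 and e, the initial data, and the small n and td are placeholders chosen for a quick host run, and the scalar names p1, p2, q1, q2 are borrowed from the comment on the original subroutine. Whether the compiler really generates a single kernel for it should be checked in the -Minfo output.

```fortran
program scalar_variant
   implicit none
   integer, parameter :: fp_kind = kind(0.0d0)
   ! Placeholder constants: the thread does not give k1, k2 or e
   real(fp_kind), parameter :: k1 = 0.1_fp_kind, k2 = 0.1_fp_kind, e = 0.5_fp_kind
   ! Small placeholder sizes; the post uses n = 256*256 and td around 10^7
   integer(kind=8), parameter :: n = 16, td = 1000
   real(fp_kind), allocatable, dimension(:,:) :: x
   real(fp_kind) :: p1, p2, q1, q2
   integer(kind=8) :: i, j

   allocate( x(4,n) )
   do i = 1, n
      x(:,i) = 0.01_fp_kind * real(i, fp_kind)   ! arbitrary initial conditions
   end do

   !$acc region
   !$acc do parallel, vector(256), kernel
   do i = 1, n
      ! Load one column of x into scalars instead of a private array
      p1 = x(1,i)
      p2 = x(2,i)
      q1 = x(3,i)
      q2 = x(4,i)
      do j = 1, td
         p1 = p1 + k1*(1+e*cos(q2))*sin(q1)
         p2 = p2 + k2*(1+e*cos(q1))*sin(q2)
         q1 = q1 + p1
         q2 = q2 + p2
      end do
      ! Write the iterated values back
      x(1,i) = p1
      x(2,i) = p2
      x(3,i) = q1
      x(4,i) = q2
   end do
   !$acc end region

   print *, x(:,1)
end program scalar_variant
```

Built without accelerator support, the !$acc lines are ordinary comments and the program runs on the host, which gives a reference result to compare against the accelerated build.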

Hope this helps,
Mat
All times are GMT - 7 Hours