PGI User Forum
Accelerator restriction: unknown right hand side reference ?
Alistair Hart



Joined: 06 Jul 2010
Posts: 21
Location: Cray Exascale Research Initiative, Edinburgh

Posted: Fri Jul 30, 2010 5:49 am    Post subject: Accelerator restriction: unknown right hand side reference ?

Hi,

I am accelerating the region shown at the end of this post, where all the arrays are pointers passed into a subprogram. They are not declared restrict, although I know they do not alias in this loop, so I use the independent clause.

When I compile with pgcc 10.6-0 64-bit target on x86-64 Linux -tp nehalem-64, I get:

Code:

     91, Generating copyin(txz[ioXz+(ioXx*n1)-1:nz+(n1*nx)+2])
         Generating copyin(txx[ioXz+((ioXx-1)*n1):nz+(n1*(nx+1))])
         Generating copyin(rox[ioXz+(ioXx*n1):nz+(n1*nx)])
         Generating copy(vx[ioXz+(ioXx*n1):nz+(n1*nx)])
         Generating compute capability 2.0 binary
     97, Loop is parallelizable
         Accelerator kernel generated
         97, #pragma acc for parallel, vector(256)
             CC 2.0 : 10 registers; 4 shared, 128 constant, 0 local memory bytes; 100 occupancy
     99, Loop is parallelizable
    104, Accelerator restriction: unknown right hand side reference


What does the final message mean, please, and how do I solve the problem?

Cheers,

Alistair.


Code:

#pragma acc region \
  copy(  vx [ioXx*n1+ioXz    :nx*n1+nz]) \
  copyin(rox[ioXx*n1+ioXz    :nx*n1+nz]) \
  copyin(txx[(ioXx-1)*n1+ioXz:(nx+1)*n1+nz]) \
  copyin(txz[ioXx*n1+ioXz-1  :nx*n1+nz+2])
#pragma acc for independent
        for (ix=ioXx; ix<nx+1; ix++) {
#pragma acc for independent
                for (iz=ioXz; iz<nz+1; iz++) {
                        vx[ix*n1+iz] += rox[ix*n1+iz]*(
                             c1*(txx[ix*n1+iz]     - txx[(ix-1)*n1+iz] +
                                 txz[ix*n1+iz+1]   - txz[ix*n1+iz])    +
                             c2*(txx[(ix+1)*n1+iz] - txx[(ix-2)*n1+iz] +
                                 txz[ix*n1+iz+2]   - txz[ix*n1+iz-1])  );
                }
        }
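
For comparison, here is a minimal CPU sketch of the same update with the pointers declared restrict, which is the C99 way to state the no-alias promise made in the post. The function names update_vx and demo_update_vx and the array sizes are hypothetical, and whether pgcc 10.6 would then accept the loops without "acc for independent" is an assumption this sketch does not test. Note that restrict only covers overlap between arrays; it says nothing about two iterations writing the same element of vx.

```c
#include <assert.h>

/* CPU reference of the vx update. restrict promises the compiler that
   the four arrays do not overlap (hypothetical signature). */
void update_vx(float *restrict vx, const float *restrict rox,
               const float *restrict txx, const float *restrict txz,
               int n1, int ix_lo, int ix_hi, int iz_lo, int iz_hi)
{
    const float c1 = 9.0f / 8.0f, c2 = -1.0f / 24.0f;

    assert(ix_lo >= 2 && iz_lo >= 1 && n1 > 0); /* stencil reaches ix-2 and iz-1 */
    for (int ix = ix_lo; ix <= ix_hi; ix++) {
        for (int iz = iz_lo; iz <= iz_hi; iz++) {
            vx[ix*n1+iz] += rox[ix*n1+iz] * (
                c1 * (txx[ix*n1+iz]     - txx[(ix-1)*n1+iz] +
                      txz[ix*n1+iz+1]   - txz[ix*n1+iz]) +
                c2 * (txx[(ix+1)*n1+iz] - txx[(ix-2)*n1+iz] +
                      txz[ix*n1+iz+2]   - txz[ix*n1+iz-1]));
        }
    }
}

/* Made-up test field: txx[i] = i, txz = 0, rox = 1, vx = 0.
   Returns the updated vx at ix = 3, iz = 2. */
float demo_update_vx(void)
{
    enum { N1 = 8, NX = 4, NZ = 4, LEN = (NX + 2) * N1 };
    float vx[LEN], rox[LEN], txx[LEN], txz[LEN];

    for (int i = 0; i < LEN; i++) {
        vx[i] = 0.0f; rox[i] = 1.0f; txx[i] = (float)i; txz[i] = 0.0f;
    }
    update_vx(vx, rox, txx, txz, N1, 2, NX, 1, NZ);
    return vx[3 * N1 + 2];
}
```

With this linear txx field the c1 term contributes 9/8 * N1 and the c2 term -1/24 * 3 * N1, so demo_update_vx() should return a value very close to N1 = 8.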
mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Wed Aug 04, 2010 3:25 pm

Hi Alistair,

Sorry for the late response. I wasn't sure, since it's the first time I've seen this message, and I needed some input from our compiler engineers.

This message indicates that the compiler thinks one or more of your arrays (vx, txx, txz, rox) is volatile. Would you be able to post a reproducing example, or show how the arrays and the c1, c2, nz, nx, and n1 variables are declared?

Thanks,
Mat
Alistair Hart



Joined: 06 Jul 2010
Posts: 21
Location: Cray Exascale Research Initiative, Edinburgh

Posted: Tue Aug 10, 2010 5:48 am

Thanks. Below is a (non-functioning) version of the code that produces the same message at compile time, reported for line 33. Strangely, the second loop nest does not generate the same message, despite looking functionally similar.

I needed the "acc for independent" directives on both loops to get acceleration.

Cheers,

Alistair.

Code:
struct modPar { int nx, nz, naz, iorder; };

int e4(struct modPar mod,
       float *vx, float *vz,
       float *tzz, float *txx, float *txz,
       float *rox, float *roz)
{

  int ix, iz, nx, nz, n1, ioXx, ioXz, ioZz, ioZx;

  const float c1 = 9.0/8.0, c2 = -1.0/24.0;
  nx  = mod.nx;
  nz  = mod.nz;
  n1  = mod.naz;

  ioXx=mod.iorder/2;
  ioXz=ioXx-1;
  ioZz=mod.iorder/2;
  ioZx=ioZz-1;

  int lmin = ioXx*n1+ioXz;
  int lmax = nx*n1+nz;
#pragma acc region copy(vx[lmin:lmax]) copyin(rox[lmin:lmax])   \
  copyin(txx[lmin-2*n1:lmax+n1]) copyin(txz[lmin-1:lmax+2])
#pragma acc for independent
  for (ix=ioXx; ix<nx+1; ix++) {
#pragma acc for independent
    for (iz=ioXz; iz<nz+1; iz++) {
      vx[ix*n1+iz] += rox[ix*n1+iz]*(
                      c1*(txx[ix*n1+iz]     - txx[(ix-1)*n1+iz] +
                          txz[ix*n1+iz+1]   - txz[ix*n1+iz])    +
                      c2*(txx[(ix+1)*n1+iz] - txx[(ix-2)*n1+iz] +
                          txz[ix*n1+iz+2]   - txz[ix*n1+iz-1])  );
    }
  }

  int lmin2 = ioZx*n1+ioZz;
  int lmax2 = nx*n1+nz;
#pragma acc region copy(vz[lmin2:lmax2]) copyin(roz[lmin2:lmax2])       \
  copyin(tzz[lmin2-2 :lmax2+1]) copyin(txz[lmin2-n1:lmax2+2*n1])
#pragma acc for independent
  for (ix=ioZx; ix<nx+1; ix++) {
#pragma acc for independent
    for (iz=ioZz; iz<nz+1; iz++) {
      vz[ix*n1+iz] += roz[ix*n1+iz]*(
                      c1*(tzz[ix*n1+iz]     - tzz[ix*n1+iz-1] +
                          txz[(ix+1)*n1+iz] - txz[ix*n1+iz])  +
                      c2*(tzz[ix*n1+iz+1]   - tzz[ix*n1+iz-2] +
                          txz[(ix+2)*n1+iz] - txz[(ix-1)*n1+iz])  );
    }
  }
}

void main(){
  struct modPar mod;
  float *vx, *vz, *tzz, *txx, *txz, *rox, *roz;
  e4(mod,vx,vz,tzz,txx,txz,rox,roz);
}



Code:

41$ pgcc -V

pgcc 10.6-0 64-bit target on x86-64 Linux -tp nehalem-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc.  All Rights Reserved.
42$ pgcc -ta=nvidia:cc20 -Minfo tb_test.c
e4:
     23, Generating copyin(txz[lmin-1:lmax+2])
         Generating copyin(txx[lmin-(n1*2):n1+lmax])
         Generating copyin(rox[lmin:lmax])
         Generating copy(vx[lmin:lmax])
         Generating compute capability 2.0 binary
     26, Loop is parallelizable
         Accelerator kernel generated
         26, #pragma acc for parallel, vector(256)
             CC 2.0 : 10 registers; 4 shared, 120 constant, 0 local memory bytes; 100 occupancy
     28, Loop is parallelizable
     33, Accelerator restriction: unknown right hand side reference
     39, Generating copyin(txz[lmin2-n1:(n1*2)+lmax2])
         Generating copyin(tzz[lmin2-2:lmax2+1])
         Generating copyin(roz[lmin2:lmax2])
         Generating copy(vz[lmin2:lmax2])
         Generating compute capability 2.0 binary
     42, Loop is parallelizable
         Accelerator kernel generated
         42, #pragma acc for parallel, vector(256)
             CC 2.0 : 22 registers; 4 shared, 112 constant, 0 local memory bytes; 83 occupancy
     44, Loop is parallelizable
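
The bounds in the data clauses of the reproducer can be checked with a little arithmetic. The sketch below assumes the clauses take inclusive lower:upper index bounds, which is how the reproducer uses them (lmax is the largest index written to vx), and that n1 is non-negative. The names idx_range and stencil_range are hypothetical helpers, not part of any PGI API.

```c
#include <assert.h>

/* Inclusive range of flattened indices touched by one array. */
struct idx_range { int lo, hi; };

/* For accesses a[(ix+dx)*n1 + iz+dz] with ix in [ix_lo, ix_hi],
   iz in [iz_lo, iz_hi], dx in [dx_min, dx_max], dz in [dz_min, dz_max],
   return the smallest and largest index touched. With n1 >= 0 the
   extremes are reached at the corners of the iteration space. */
struct idx_range stencil_range(int ix_lo, int ix_hi, int iz_lo, int iz_hi,
                               int n1,
                               int dx_min, int dx_max,
                               int dz_min, int dz_max)
{
    struct idx_range r;

    assert(n1 >= 0 && ix_lo <= ix_hi && iz_lo <= iz_hi);
    assert(dx_min <= dx_max && dz_min <= dz_max);
    r.lo = (ix_lo + dx_min) * n1 + iz_lo + dz_min;
    r.hi = (ix_hi + dx_max) * n1 + iz_hi + dz_max;
    return r;
}
```

For the first loop nest, txx is read at ix-2 through ix+1, so stencil_range(ioXx, nx, ioXz, nz, n1, -2, 1, 0, 0) gives lmin-2*n1 and lmax+n1, and txz is read at iz-1 through iz+2, giving lmin-1 and lmax+2. Both agree with the clauses on line 23 of the listing. With made-up values ioXx=2, ioXz=1, nx=10, nz=8, n1=12 (so lmin=25, lmax=128), the txx range comes out as 1 to 140.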

mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Tue Aug 10, 2010 8:32 am

Hi Alistair,

Thank you for the example code. I'll send it to our compiler engineers for further study, but it appears to me to be a transient compiler issue: the message does not appear when compiling with the 10.5 release or with the upcoming 10.8 release.

Quote:
I needed the "acc for independent" directives on both loops to get acceleration.
This is fine and a good use of the directive. The compiler is unable to determine at compile time that all elements of the vx and vz arrays are independent since their index is computed.

Best Regards,
Mat
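
The point about computed indices can be illustrated with a small sketch. Whether two iterations write the same element of vx depends on the runtime value of n1, which is why this cannot be decided at compile time. The function name count_index_collisions is hypothetical; it simply counts repeated writes to the flattened index ix*n1+iz over the loop nest.

```c
#include <assert.h>
#include <stdlib.h>

/* Count writes that land on an already-written flattened index
   ix*n1+iz, for ix in [ix_lo, ix_hi] and iz in [iz_lo, iz_hi].
   Returns -1 if the scratch allocation fails. */
int count_index_collisions(int ix_lo, int ix_hi, int iz_lo, int iz_hi, int n1)
{
    assert(ix_lo >= 0 && iz_lo >= 0 && n1 >= 0);
    assert(ix_lo <= ix_hi && iz_lo <= iz_hi);

    int size = ix_hi * n1 + iz_hi + 1;       /* largest index + 1 */
    int *hits = calloc((size_t)size, sizeof *hits);
    if (hits == NULL)
        return -1;

    int collisions = 0;
    for (int ix = ix_lo; ix <= ix_hi; ix++) {
        for (int iz = iz_lo; iz <= iz_hi; iz++) {
            int idx = ix * n1 + iz;
            if (hits[idx]++ > 0)             /* seen before: not independent */
                collisions++;
        }
    }
    free(hits);
    return collisions;
}
```

When n1 is larger than iz_hi - iz_lo, every (ix, iz) pair maps to a distinct element and the count is zero; a smaller n1 makes rows overlap. The independent clause is the programmer's assertion that the first case holds.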
Alistair Hart



Joined: 06 Jul 2010
Posts: 21
Location: Cray Exascale Research Initiative, Edinburgh

Posted: Wed Aug 11, 2010 3:45 am

Thanks for the quick diagnosis. I'll keep waiting for 10.8.
All times are GMT - 7 Hours