PGI User Forum

OpenACC cache directive issues

 
ValeriuCodreanu45554



Joined: 27 Nov 2012
Posts: 2

Posted: Mon Feb 11, 2013 6:56 am    Post subject: OpenACC cache directive issues

Hello,

I have some problems using the OpenACC cache directive with the PGI compiler.

Code:
int iters = 0;
#pragma acc data copy(Uold), copyin(rhs), create(Unew)
{
  while (iters < max_iters && l2_norm > 1e-9) {
    ++iters;
    /* update each interior point */
#pragma acc kernels loop independent
    for (k = 1; k <= n; k++) {
#pragma acc loop independent
      for (j = 1; j <= n; j++) {
#pragma acc loop independent
        for (i = 1; i <= n; i++) {
#pragma acc cache(rhs[k-1:k+1][j-1:j+1][i-1:i+1])
//#pragma acc cache(Uold[k-1:k+1][j-1:j+1][i-1:i+1])
          Unew[k][j][i] = factor*(rhs[k][j][i]
              +factor2*(Uold[k][j][i-1]+Uold[k][j][i+1]
                       +Uold[k][j-1][i]+Uold[k][j+1][i]
                       +Uold[k+1][j][i]+Uold[k-1][j][i])
              +Uold[k-1][j-1][i]+Uold[k-1][j+1][i]
              +Uold[k-1][j][i-1]+Uold[k-1][j][i+1]
              +Uold[k][j-1][i-1]+Uold[k][j+1][i-1]
              +Uold[k][j-1][i+1]+Uold[k][j+1][i+1]
              +Uold[k+1][j-1][i]+Uold[k+1][j+1][i]
              +Uold[k+1][j][i-1]+Uold[k+1][j][i+1]);
        }
      }
    }
    /* pointer swap */
    REAL*** tmp;
    REAL*** p_old = Uold;
    REAL*** p_new = Unew;
    tmp = p_old; p_old = p_new; p_new = tmp;
    nIters = iters;
  }
}


If I uncomment the second #pragma acc cache, the compiler outputs:
Code:
PGC-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unexpected load/store type (openacc_poisson.c: 149)
main:
    140, getTime inlined, size=6, file openacc_poisson.c (67)
    144, Generating create(Unew[0:][0:][0:])
         Generating copyin(rhs[0:][0:][0:])
         Generating copy(Uold[0:][0:][0:])
    150, Loop is parallelizable
    152, Loop is parallelizable
    154, Loop is parallelizable
         Accelerator kernel generated
        150, #pragma acc loop vector(4) /* threadIdx.y */
        152, #pragma acc loop gang /* blockIdx.y */
             Cached references to size [(y+2)x3x(x+2)] block of 'rhs'
             Cached references to size [(y+2)x3x(x+2)] block of 'Uold'
        154, #pragma acc loop gang, vector(64) /* blockIdx.x threadIdx.x */
    178, getTime inlined, size=6, file openacc_poisson.c (67)
PGC/x86-64 Linux 13.1-1: compilation completed with warnings


It seems that the PGI compiler has a problem with the j+1 index in my Uold array. If I remove the loads from Uold[][j+1][], the compilation works, but of course the program does not execute correctly.

What do you think I should try?

Thank you,
Valeriu
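
For reference, the OpenACC specification writes C array sections as [start:length], so a three-element window around index i is [i-1:3], whereas [i-1:i+1] reads as a lower:upper pair. The sketch below shows the cache directive in start:length form on a small 7-point sum. The function name sum_neighbours, the sizes N and M, and the fixed-size arrays are assumptions made for illustration and are not taken from openacc_poisson.c. Nothing here is known to avoid the PGC-W-0155 failure reported above. Without an OpenACC compiler the pragmas are ignored and the loops run serially.

```c
#include <assert.h>

#define N 4        /* interior points per dimension (hypothetical size) */
#define M (N + 2)  /* interior plus one halo layer on each side */

/* Writes, for every interior point, the sum of the point and its six
   face neighbours in rhs. Halo entries of out are left untouched. */
void sum_neighbours(double rhs[M][M][M], double out[M][M][M])
{
    int i, j, k;

    /* "independent" is only valid if input and output do not alias. */
    assert(rhs != out);

#pragma acc kernels loop independent copyin(rhs[0:M][0:M][0:M]) copy(out[0:M][0:M][0:M])
    for (k = 1; k <= N; k++) {
#pragma acc loop independent
        for (j = 1; j <= N; j++) {
#pragma acc loop independent
            for (i = 1; i <= N; i++) {
                /* start:length form: 3 elements starting at index-1 */
#pragma acc cache(rhs[k-1:3][j-1:3][i-1:3])
                out[k][j][i] = rhs[k][j][i]
                             + rhs[k][j][i-1] + rhs[k][j][i+1]
                             + rhs[k][j-1][i] + rhs[k][j+1][i]
                             + rhs[k-1][j][i] + rhs[k+1][j][i];
            }
        }
    }
}
```

With rhs filled with 1.0, every interior entry of out becomes 7.0, which gives a quick way to check that builds with and without the cache directive agree.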
mkcolg



Joined: 30 Jun 2004
Posts: 6211
Location: The Portland Group Inc.

Posted: Mon Feb 11, 2013 10:56 am    Post subject:

Hi Valeriu,

Quote:
PGC-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unexpected load/store type (openacc_poisson.c: 149)
This is most likely a compiler error. Can you please send a reproducing case to PGI Customer Support (trs@pgroup.com)?

Quote:
but of course the program does not execute correctly.
Caching shouldn't affect correct execution, so this may be caused by something else. Have you tested your program without OpenACC enabled? Your pointer swapping doesn't look correct to me, since Uold and Unew themselves are never swapped.

- Mat
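
The point about the swap can be illustrated on the host alone. This is a minimal sketch, assuming flat double buffers rather than the REAL*** arrays in the original post. The function names are hypothetical, and how a host-side swap interacts with the enclosing acc data region is not covered here.

```c
#include <assert.h>

/* Mirrors the swap in the original post: Uold and Unew are copied into
   p_old and p_new, and only the copies are exchanged.
   Returns 1 if Uold ends up pointing at b, otherwise 0. */
int swap_via_local_copies(double *a, double *b)
{
    double *Uold = a, *Unew = b;
    double *p_old = Uold, *p_new = Unew;
    double *tmp = p_old; p_old = p_new; p_new = tmp;

    assert(p_old == b && p_new == a);  /* the copies did swap */
    return Uold == b;                  /* but Uold did not follow */
}

/* Exchanges the named pointers themselves, so the next iteration
   reads from the buffer that was just written. */
int swap_named_pointers(double *a, double *b)
{
    double *Uold = a, *Unew = b;
    double *tmp = Uold; Uold = Unew; Unew = tmp;

    assert(Unew == a);
    return Uold == b;
}
```

Called with two distinct buffers, the first function returns 0 and the second returns 1.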
ValeriuCodreanu45554



Joined: 27 Nov 2012
Posts: 2

Posted: Wed Feb 13, 2013 3:59 am    Post subject:

Hi Mat,

Thanks for the reply. I've sent the full example to Customer Support.
What I meant was that if I remove certain computations from my for loops (i.e. the accesses to the [j+1] components of the Uold array), then the output is wrong because of this algorithmic change.

I think the pointer swap is done correctly.

Best,
Vali
All times are GMT - 7 Hours