PGI User Forum
OpenACC: How to CACHE into GPU shared memory?

 
_sayan_



Joined: 07 Apr 2012
Posts: 29

Posted: Thu Aug 02, 2012 4:39 pm    Post subject: OpenACC: How to CACHE into GPU shared memory?

Hi,

My code is a 26-point isotropic stencil, and I want to pre-fetch some values into GPU shared memory, typically i-1:i+4 and j-1:j+4. I have not been able to do this; I get a warning like:

Code:

PGF90-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): multiple indices in shared memory dimension (kernel.f90: 416)


or
Code:

PGF90-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): unknown shared array size (kernel.f90: 617)

when I suppress a dimension.
My code structure is as follows:

Code:

  !$ACC KERNELS                      &
  !$ACC PRESENT(p0,q0,phi,eta,roc2)
  !$ACC LOOP INDEPENDENT
  do k=k0,k1
   !$ACC LOOP INDEPENDENT
   do j=j0,j1
    !$ACC LOOP INDEPENDENT
    do i=i0,i1
     !$ACC CACHE(p0(...),q0(...))

Perhaps it is a bad idea to have the cache construct execute so many times. I would like some idea of how it could be used efficiently in my case.

Thank you very much,
Sayan

UPDATE:

This is working when I specify the cache construct as:
Code:

!$ACC CACHE(p0(i-1:i+4,j-1:j+4,k-1:k+1), q0(i-1:i+4,j-1:j+4,k-1:k+1))


compilation info:
Code:

        423, Cached references to size [(x+5)x6x(y+2)] block of 'q0'
             Cached references to size [(x+5)x6x(y+2)] block of 'p0'


Now I get this warning instead:
Code:

PGF90-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): illegal opcode (kernel.f90: 416)

The code structure is the same as above. But the problem is that the code is terribly slow; I would need to change the loop mapping. Any ideas are welcome.
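
One way to change the loop mapping is to put explicit GANG, WORKER and VECTOR clauses on the loops rather than leaving the schedule to the compiler. The sketch below is only an illustration under stated assumptions: the 7-point average, the names `stencil_mapping_sketch` and `smooth_mapped`, the COPYIN/COPY clauses, and the vector length of 64 are all hypothetical and untuned, and the best mapping depends on the compiler version and the GPU.

```fortran
module stencil_mapping_sketch
  implicit none
contains
  ! Hypothetical 7-point average standing in for the real 26-point stencil.
  subroutine smooth_mapped(p0, q0, n)
    integer, intent(in) :: n
    real, intent(in)    :: p0(n, n, n)
    real, intent(inout) :: q0(n, n, n)
    integer :: i, j, k

    !$ACC KERNELS COPYIN(p0) COPY(q0)
    ! Outer loop: spread k iterations across gangs.
    !$ACC LOOP INDEPENDENT GANG
    do k = 2, n - 1
      ! Middle loop: spread j iterations across workers within a gang.
      !$ACC LOOP INDEPENDENT WORKER
      do j = 2, n - 1
        ! Inner loop: i is the contiguous (first) index in Fortran, so
        ! vectorizing it gives neighbouring threads neighbouring elements.
        ! The length 64 is an illustrative guess, not a tuned value.
        !$ACC LOOP INDEPENDENT VECTOR(64)
        do i = 2, n - 1
          q0(i, j, k) = (p0(i, j, k)                     &
                       + p0(i-1, j, k) + p0(i+1, j, k)   &
                       + p0(i, j-1, k) + p0(i, j+1, k)   &
                       + p0(i, j, k-1) + p0(i, j, k+1)) / 7.0
        end do
      end do
    end do
    !$ACC END KERNELS
  end subroutine smooth_mapped
end module stencil_mapping_sketch
```

The schedule the compiler finally chose for each loop appears in the -Minfo messages, which makes it possible to compare a hand-written mapping against the default one.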
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Fri Aug 03, 2012 1:53 pm

Hi Sayan,

Quote:
PGF90-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): illegal opcode (kernel.f90: 416)
This is actually a compiler error, but we haven't had any other reports of it yet. If you can send us a reproducing example, that would be great; otherwise, I'll see if I can reproduce it myself. With the cache clause being so new, there are unfortunately bound to be problems.

Quote:
But the problem is that the code is terribly slow,
The error above is most likely preventing the kernel from being generated, so it could be the cause.

- Mat
All times are GMT - 7 Hours