PGI User Forum

Using the cache directive

 
mmikalsen



Joined: 30 Aug 2012
Posts: 3

Posted: Sun Jun 09, 2013 9:00 am    Post subject: Using the cache directive

Hi


I have an SOR Poisson solver kernel that I would like to try the cache directive on. However, I am a little unclear about how and where to use it, and I was wondering if anyone could give me some suggestions.

I am using the deviceptr clause because data is allocated using acc_malloc, and the mask value represents obstacles in the volume.

I am also wondering what the optimal gang/vector configuration would be when working on a 3D grid such as 128x32x128 (x,y,z). By default, the compiler partitions the two inner loops (x,z) across gangs and the inner loop (x) across vectors.

Thanks for any suggestions.

Code:


#define I(X,Y,Z) ((X) + (Z)*dim.x + (Y)*dim.z*dim.x)

void solve_poisson(float *p, float *p0, float *b, int *obs, float *poisson_tab, dim_3 dim, float w) {
#pragma acc kernels deviceptr(p,p0,obs,b,poisson_tab) copyin(w)   
{     
    #pragma acc loop independent
    for(int y = 1; y < dim.y-1; ++y) {
        #pragma acc loop independent
        for(int z = 1; z < dim.z-1; ++z) {
            #pragma acc loop independent
            for(int x = 1; x < dim.x-1; ++x) {

                int mask = obs[I(x,y,z)] & 127;

                if ((~mask & 126) && (mask & VOX_SELF) == 0) {
                    float res = 0.0f;
                    res += p0[I(x-1,y,z)] * (float)((mask & VOX_LEFT) == 0);
                    res += p0[I(x+1,y,z)] * (float)((mask & VOX_RIGHT) == 0);
                    res += p0[I(x,y-1,z)] * (float)((mask & VOX_BELOW) == 0);
                    res += p0[I(x,y+1,z)] * (float)((mask & VOX_ABOVE) == 0);
                    res += p0[I(x,y,z-1)] * (float)((mask & VOX_UP) == 0);
                    res += p0[I(x,y,z+1)] * (float)((mask & VOX_DOWN) == 0);
                    res -= b[I(x,y,z)];
                    res *= poisson_tab[mask>>1];
                    res *= w;
                    res += p0[I(x,y,z)]*(1.0f - w);
                    p[I(x,y,z)] = res;
                }
                else {
                    p[I(x,y,z)] = 0.0f;
                }
            }
        }
    }
}
}
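
The memory layout implied by the I(X,Y,Z) macro matters for both questions, so here is a minimal sketch that spells it out. The dim_3 struct below is a hypothetical stand-in (the real type is not shown in this post), and flat_index and the stride_* helpers are illustrative names, not part of the solver.

```c
#include <stddef.h>

/* Hypothetical stand-in for the dim_3 type, which the post does not show. */
typedef struct { int x, y, z; } dim_3;

/* Function form of the I(X,Y,Z) macro: x is contiguous, then z, then y. */
static size_t flat_index(dim_3 dim, int x, int y, int z) {
    return (size_t)x + (size_t)z * dim.x + (size_t)y * dim.z * dim.x;
}

/* Distance in elements between neighbouring cells along each axis. */
static size_t stride_x(dim_3 dim) {
    return flat_index(dim, 1, 0, 0) - flat_index(dim, 0, 0, 0);
}

static size_t stride_z(dim_3 dim) {
    return flat_index(dim, 0, 0, 1) - flat_index(dim, 0, 0, 0);
}

static size_t stride_y(dim_3 dim) {
    return flat_index(dim, 0, 1, 0) - flat_index(dim, 0, 0, 0);
}
```

For the 128x32x128 grid mentioned above, the x neighbours are 1 element apart, the z neighbours 128 elements apart, and the y neighbours 16384 elements apart. Only the x-1 and x+1 reads of p0 sit next to the centre cell in memory, which is presumably why the x loop is the one mapped to vectors.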


Last edited by mmikalsen on Mon Jun 10, 2013 8:29 pm; edited 1 time in total
mkcolg



Joined: 30 Jun 2004
Posts: 6120
Location: The Portland Group Inc.

Posted: Mon Jun 10, 2013 9:34 am

Hi mmikalsen,

For the "cache" directive, the syntax would be something like:
Code:
            #pragma acc loop independent
            for(int x = 1; x < dim.x-1; ++x) {
                #pragma acc cache(p0[x:128])

Unfortunately, it can be tricky to use, and the compiler doesn't always accept it, depending on the array and its layout. We're working on it and should have better support in the future. That said, the compiler typically does a good job of using shared memory on its own, so the cache directive often isn't needed.
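
As a smaller, self-contained illustration of where the directive goes, here is a sketch on a 1D three-point stencil rather than the 3D solver. The function name smooth3 and the data clauses are assumptions made for this example only, and whether a given compiler accepts the cache directive here (or gains anything from it) is not guaranteed.

```c
/* Three-point average over the interior of a 1D array.
   Without an OpenACC compiler the pragmas are ignored and the loop
   simply runs serially on the host. */
void smooth3(const float *in, float *out, int n) {
    #pragma acc parallel loop copyin(in[0:n]) copy(out[0:n])
    for (int i = 1; i < n - 1; ++i) {
        /* The cache directive sits at the top of the loop body and names
           the sub-array (start:length) that this iteration reads. */
        #pragma acc cache(in[i-1:3])
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
    }
}
```

The sub-array is written as start:length, so in[i-1:3] covers in[i-1], in[i], and in[i+1]. Only the array that is read more than once per neighbourhood (in here, p0 in the solver) is a candidate for caching.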

More often than not, the compiler finds the optimal schedule, and in this case it may have. However, I would have thought the default would be gang, gang vector, vector. You can override the default and experiment:

Code:
    #pragma acc loop gang independent
    for(int y = 1; y < dim.y-1; ++y) {
        #pragma acc loop gang vector independent
        for(int z = 1; z < dim.z-1; ++z) {
            #pragma acc loop vector independent
            for(int x = 1; x < dim.x-1; ++x) {
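
Another schedule that could be tried, shown as a complete function so it compiles on its own. This is a different variant from the fragment above, not the same schedule: it collapses the y and z loops into one gang loop and maps x to vectors. The function name scale_interior, the trivial loop body, and the vector_length value of 128 are all assumptions for illustration. It uses a parallel construct with a copy clause instead of kernels with deviceptr, so it does not need acc_malloc.

```c
/* Scales every interior cell of an nx*ny*nz grid stored x-fastest,
   i.e. the same layout as the I(X,Y,Z) macro in the question.
   Without an OpenACC compiler the pragmas are ignored and the loops
   run serially on the host. */
void scale_interior(float *p, int nx, int ny, int nz, float s) {
    /* y and z are collapsed into one iteration space and spread across gangs. */
    #pragma acc parallel loop gang collapse(2) vector_length(128) copy(p[0:nx*ny*nz])
    for (int y = 1; y < ny - 1; ++y) {
        for (int z = 1; z < nz - 1; ++z) {
            /* x is contiguous in memory, so it goes to the vector lanes. */
            #pragma acc loop vector
            for (int x = 1; x < nx - 1; ++x) {
                p[x + z * nx + y * nz * nx] *= s;
            }
        }
    }
}
```

Which schedule is fastest depends on the device and the compiler version, so comparing the compiler feedback (-Minfo=accel with PGI) and timings for each variant is the practical way to choose.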


Hope this helps,
Mat
All times are GMT - 7 Hours