PGI User Forum

parallel + independent and kernels + vector_length()
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Mon Aug 20, 2012 9:31 am

Quote:
Hence, will there be the possibility to change the vector_length within a kernels region?
In a kernels region, the "vector" clause on a "loop" directive accepts a width, for example "#pragma acc loop vector(128)".
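As a minimal sketch of what that looks like on a single loop (the function name, arrays, and data clauses here are hypothetical and not taken from the original poster's code):

```c
#include <assert.h>

/* Hypothetical helper: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy_kernels(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
    /* Data clauses are an assumption; use present(...) instead if the
       arrays already live on the device. */
#pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    {
        /* The vector width is given on the loop directive itself. */
#pragma acc loop independent gang vector(128)
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

A compiler without OpenACC support ignores the pragmas and runs the loop serially, which is handy for checking results on the host.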

Quote:
Furthermore, do you know why the compiler schedules the workload among gangs and vectors?
I'm not sure I understand the question. A "gang" corresponds to a CUDA thread block, while a "vector" corresponds to the threads within a block. The compiler has to schedule work at both levels because that is how the thread execution model is organized on an NVIDIA device (see: http://www.pgroup.com/lit/articles/insider/v2n1a5.htm).
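To make the gang/vector correspondence a bit more concrete, here is a rough host-side sketch in plain C of how a 1-D iteration index could be split into a gang index and a vector lane. It is the same arithmetic as blockIdx.x * blockDim.x + threadIdx.x in CUDA. The type and function names are hypothetical, and the mapping the compiler actually picks may differ.

```c
#include <assert.h>

/* Hypothetical pair: which gang (block) and which vector lane (thread). */
typedef struct {
    int gang;
    int lane;
} work_index;

/* Number of gangs needed to cover n iterations with the given vector length. */
int gangs_needed(int n, int vector_length)
{
    assert(n >= 0 && vector_length > 0);
    return (n + vector_length - 1) / vector_length;
}

/* Split a flat iteration index into (gang, lane). */
work_index split_index(int i, int vector_length)
{
    work_index w = { i / vector_length, i % vector_length };
    return w;
}

/* Recombine (gang, lane) into the flat iteration index. */
int global_index(work_index w, int vector_length)
{
    return w.gang * vector_length + w.lane;
}
```

For example, with a vector length of 128, iteration 300 would land in gang 2, lane 44 under this scheme.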

For the kernels version, let's remove the "collapse" clause and explicitly schedule the second loop. You could also set the gang width explicitly.

Code:
#pragma acc kernels present(Ahat[0:n*k],x[0:k],tmpArray[0:n*numBlocksK])
{
#pragma acc loop independent gang
   for (int i = 0; i < numBlocksN; i++) {
#pragma acc loop independent gang  // You can set the width here as well
      for (int j = 0; j < numBlocksK; j++) {
#pragma acc loop independent vector(256)  // Vector length should be the same as BLOCK_SIZE
         for (int l = 0; l < BLOCK_SIZE; ++l) {
            precision tmp;
            tmp = 0.0;
#pragma unroll(UNROLL_SIZE)
            for (int m = 0; m < BLOCK_SIZE; ++m) {
               tmp += Ahat[(i*BLOCK_SIZE + l)*k + j*BLOCK_SIZE + m] * x[j*BLOCK_SIZE + m];
            }
            tmpArray[(i*BLOCK_SIZE + l)*numBlocksK + j] += tmp;
         }
      } // for j
   } // for i
}
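If you want to check the loop nest on the host, a self-contained sketch along these lines may help. Everything outside the loop body is an assumption on my part: the wrapper name, the "precision" typedef, the tiny BLOCK_SIZE of 4 (chosen only so the example is easy to test; on the device it would be 256 to match the vector width above), the copyin/copy data clauses in place of "present", and the final reduction over the partial sums.

```c
#include <assert.h>

#define BLOCK_SIZE 4       /* assumed small value for host testing */
typedef double precision;  /* assumed definition of the thread's type */

/* Hypothetical wrapper: per-block partial products of a row-major
   n x k matrix Ahat with vector x. tmpArray must be zeroed by the caller. */
void blocked_partial_matvec(int n, int k, const precision *Ahat,
                            const precision *x, precision *tmpArray)
{
    assert(n % BLOCK_SIZE == 0 && k % BLOCK_SIZE == 0);
    int numBlocksN = n / BLOCK_SIZE;
    int numBlocksK = k / BLOCK_SIZE;

#pragma acc kernels copyin(Ahat[0:n*k], x[0:k]) copy(tmpArray[0:n*numBlocksK])
    {
#pragma acc loop independent gang
        for (int i = 0; i < numBlocksN; i++) {
#pragma acc loop independent gang
            for (int j = 0; j < numBlocksK; j++) {
#pragma acc loop independent vector(BLOCK_SIZE)
                for (int l = 0; l < BLOCK_SIZE; ++l) {
                    precision tmp = 0.0;
                    for (int m = 0; m < BLOCK_SIZE; ++m) {
                        tmp += Ahat[(i*BLOCK_SIZE + l)*k + j*BLOCK_SIZE + m]
                             * x[j*BLOCK_SIZE + m];
                    }
                    tmpArray[(i*BLOCK_SIZE + l)*numBlocksK + j] += tmp;
                }
            }
        }
    }
}

/* Assumed follow-up step: add up each row's partial sums to get y = Ahat * x. */
void reduce_partials(int n, int numBlocksK, const precision *tmpArray,
                     precision *y)
{
    for (int r = 0; r < n; ++r) {
        precision sum = 0.0;
        for (int j = 0; j < numBlocksK; ++j) {
            sum += tmpArray[r*numBlocksK + j];
        }
        y[r] = sum;
    }
}
```

Without an OpenACC compiler the pragmas are ignored and the code runs serially, so the same source can serve as a reference for comparing device results.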


- Mat