PGI User Forum

How to parallel outer loop

 
Teslalady



Joined: 16 Mar 2012
Posts: 75

Posted: Wed Mar 26, 2014 7:29 am    Post subject: How to parallel outer loop

Hi, a simple question.

Please take a look at the code below:
Code:
for(i=0;i<n;i++)
{
    for(j=0;j<n;j++)
    {
        if(A<B[j])
            A=pow(B[j],2);
    }
}


A and B are arrays of the same size. The inner loop only changes the value of A, and the value of B[j] is unchanged, so I want to use OpenACC to parallelize the outer loop. How can I do that?
cparrott



Joined: 02 May 2011
Posts: 146

Posted: Wed Mar 26, 2014 12:48 pm    Post subject:

Hi,

You could add a directive like the following before the outer loop:

#pragma acc region

However, the compiler notes that there is a scalar dependency on the assignment of A inside the inner loop body, which is carried up to the outer loop as well:

main:
13, Generating present_or_copyin(B[:])
Generating NVIDIA code
14, Loop carried scalar dependence for 'A' at line 18
Accelerator scalar kernel generated
16, Loop carried scalar dependence for 'A' at line 18
Generated 1 prefetches in scalar loop

I'm not sure you could deterministically compute a value for A in a parallel computation due to this scalar dependency.
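To illustrate why the result depends on ordering when A is a scalar, here is a minimal sequential sketch. The helper name `scan_update` is hypothetical and not from the original code, and `b[j] * b[j]` stands in for `pow(B[j],2)`. It applies the same update to a scalar while visiting B in whatever order the array is given:

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): applies the
   inner-loop update from this thread to a scalar, visiting b in
   index order. b[j] * b[j] is used in place of pow(b[j], 2). */
double scan_update(double a, const double *b, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        if (a < b[j])
            a = b[j] * b[j];
    }
    return a;
}
```

Starting from 0 with B = {2, 3}, the first element sets the scalar to 4 and the second no longer passes the test. With the same values in the order {3, 2}, the first element sets it to 9 instead. A parallel schedule does not guarantee either order, which is why a single deterministic value is hard to promise.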

Hope this helps,

+chris
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Wed Apr 02, 2014 10:24 am    Post subject:

Hi Sisy,

Did you really mean for "A" to be an array? If so, then to just accelerate the outer loop, you can do something like:

Code:
#pragma acc kernels loop gang vector independent
for(i=0;i<n;i++)
{
    /* run the inner loop sequentially within each outer iteration */
    #pragma acc loop seq
    for(j=0;j<n;j++)
    {
        if(A[i]<B[j])
            A[i]=pow(B[j],2);
    }
}


"independent" may not be needed if you have declared the A and B pointers with the C99 "restrict" qualifier.
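As a rough sketch of the "restrict" variant: the function name `update_rows`, the `double` element type, and the `copy`/`copyin` data clauses are assumptions for illustration and are not taken from the code in this thread. `b[j] * b[j]` is used in place of `pow(B[j],2)`.

```c
#include <stddef.h>

/* Hypothetical wrapper (name, element type, and data clauses are
   assumptions). 'restrict' promises the compiler that a and b do
   not overlap, so 'independent' is left off the outer loop. */
void update_rows(double *restrict a, const double *restrict b, int n)
{
    #pragma acc kernels loop gang vector copy(a[0:n]) copyin(b[0:n])
    for (int i = 0; i < n; i++) {
        /* each outer iteration touches only a[i]; the inner scan
           over b runs sequentially */
        #pragma acc loop seq
        for (int j = 0; j < n; j++) {
            if (a[i] < b[j])
                a[i] = b[j] * b[j];
        }
    }
}
```

Without an OpenACC compiler the pragmas are ignored and the function runs sequentially with the same result. Whether the outer loop is actually parallelized without "independent" depends on the compiler, so it is worth checking the -Minfo=accel output.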

- Mat
All times are GMT - 7 Hours