PGI User Forum


some OpenMP performance questions

 
fgao



Joined: 28 Dec 2004
Posts: 6

Posted: Mon Aug 29, 2011 8:13 pm    Post subject: some OpenMP performance questions

1. Can whole-array assignment and the where construct utilize OpenMP features?

2. Does the forall construct perform better than a do loop?

3. Is there any performance difference between allocatable and pointer arrays in the PGI implementation?

4. Does the PGI Fortran compiler do alias analysis so that it can tell that the regions pointed to by two pointers do not overlap?
Michael Wolfe



Joined: 19 Jan 2010
Posts: 42

Posted: Fri Sep 23, 2011 11:48 am

1. Can Fortran whole-array assignment and the where construct utilize OpenMP features?

Yes, but only within the OpenMP workshare construct (placed inside a parallel region, or written as the combined parallel workshare), such as:
Code:
!$omp workshare
      a(:) = b(:) + c(:)
      where(a(:).lt.0) a(:) = 0
!$omp end workshare
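
For anyone who wants to try this, here is a minimal complete program built around the fragment above. It is only a sketch: the program name, array size, and initial values are made up for illustration, and it uses the combined parallel workshare form so that it does not depend on an enclosing parallel region. OpenMP has to be enabled at compile time (-mp with the PGI compilers, -fopenmp with gfortran); otherwise the directives are treated as comments and the code runs serially.

```fortran
program workshare_demo
  implicit none
  integer, parameter :: n = 8        ! illustrative size, not from the post
  real :: a(n), b(n), c(n)
  integer :: i

  b = [(real(i), i = 1, n)]          ! b = 1, 2, ..., 8
  c = -4.5                           ! makes the first few sums negative

  ! The combined construct opens a parallel region and shares the
  ! array statements inside it among the threads.
  !$omp parallel workshare
  a(:) = b(:) + c(:)
  where (a(:) .lt. 0) a(:) = 0
  !$omp end parallel workshare

  print *, a
end program workshare_demo
```

How the work is actually divided among threads is up to the implementation, so it is worth timing against an explicit parallel do loop before relying on it.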


2. Does the forall construct perform better than a do loop?

Typically, no. A forall statement or construct states that the right-hand side(s) can be computed in parallel and that all the assignments to the left-hand side(s) can be done in parallel, but it does not state that the right-hand side is independent of the left-hand side. This means the compiler has to use a temporary array to hold the right-hand side values, then assign from the temp array to the actual left-hand side. An important optimization is to remove that temp array, but it can't always be done.
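
To make the temp-array point concrete, here is a small sketch (array names and values are made up for illustration) where the right-hand side reads elements that the left-hand side overwrites. The forall has to behave as if every right-hand side were evaluated before any assignment, so it gives a different answer from the seemingly equivalent do loop, and the compiler cannot just turn it into that loop.

```fortran
program forall_temp_demo
  implicit none
  integer, parameter :: n = 5
  integer :: a(n), b(n), i

  a = [1, 2, 3, 4, 5]
  b = a

  ! forall: all right-hand sides use the OLD values of a, which in
  ! general requires a temporary copy of the right-hand side.
  forall (i = 2:n) a(i) = a(i-1) + 10

  ! do loop: iterations run in order, so each one sees the value
  ! written by the previous iteration.
  do i = 2, n
     b(i) = b(i-1) + 10
  end do

  print *, a   ! a is now 1, 11, 12, 13, 14
  print *, b   ! b is now 1, 11, 21, 31, 41
end program forall_temp_demo
```

When the two sides really are independent, for example a(i) = c(i) + 10 with c a distinct non-aliased array, the temp is not needed and can be optimized away.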

3. Is there any performance difference between allocatable and pointer arrays in the PGI implementation?

Yes. Allocatables have two important characteristics: First, unless they have the target attribute, the compiler knows there are no aliases with other variables. Even with the target attribute, the compiler knows there are no aliases with other allocatables or non-pointer variables or arrays. This allows better alias analysis and hence better optimization; an example is when optimizing away the temp array for forall assignments, from the previous question. Second, the compiler knows how the array was allocated, so it knows that it's contiguous and stride-1 on the leftmost dimension. Pointer arrays could have been pointer assigned from any array section, so they are not necessarily contiguous or stride-1.
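
A short sketch of the contiguity difference, with illustrative names. It assumes a compiler that supports the Fortran 2008 is_contiguous intrinsic, which may not be available in older compiler releases. Note that x needs the target attribute before a pointer can be associated with it.

```fortran
program alloc_vs_pointer
  implicit none
  real, allocatable, target :: x(:)   ! target so that p may point at it
  real, pointer :: p(:)
  integer :: i

  allocate(x(10))
  x = [(real(i), i = 1, 10)]

  ! Pointer assigned from a strided section: every other element of x.
  p => x(1:10:2)

  print *, is_contiguous(x)   ! the allocated array itself is contiguous
  print *, is_contiguous(p)   ! the stride-2 section is not
  print *, p
end program alloc_vs_pointer
```

Code that receives p generally cannot assume unit stride, while code that receives an allocatable array can.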

4. Does the PGI Fortran compiler do alias analysis so that it can tell that the regions pointed to by two pointers do not overlap?

It does some, yes, even interprocedurally, but it's not as successful as we would hope.
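
As an illustration of why this analysis matters, here is a sketch (all names hypothetical) in which two pointer arguments refer to overlapping sections of the same array. Unless the compiler can prove that dst and src do not overlap, it has to generate code that is also correct for a call like this one, for example by going through a temporary.

```fortran
program pointer_overlap
  implicit none
  real, target :: x(8)
  real, pointer :: p(:), q(:)
  integer :: i

  x = [(real(i), i = 1, 8)]
  p => x(3:8)
  q => x(1:6)            ! overlaps p in x(3:6)

  call add_into(p, q)
  print *, x             ! x is now 1, 2, 4, 6, 8, 10, 12, 14
contains
  subroutine add_into(dst, src)
    real, pointer :: dst(:), src(:)
    ! Array assignment semantics: the whole right-hand side is
    ! evaluated before dst is updated. A naive forward loop
    ! dst(i) = dst(i) + src(i) would read elements it had already
    ! overwritten and give a different result here.
    dst = dst + src
  end subroutine add_into
end program pointer_overlap
```

With allocatable or ordinary non-target arrays as arguments instead of pointers, the compiler knows more about possible aliasing and has more room to optimize.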