Posted: Fri Sep 23, 2011 11:48 am
1. Can Fortran whole-array assignment and the where construct utilize OpenMP features?
Yes, but only within the OpenMP workshare construct, such as:
!$omp workshare
a(:) = b(:) + c(:)
where(a(:).lt.0) a(:) = 0
!$omp end workshare
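For anyone who wants to try this, here is a minimal, self-contained sketch of the same two statements. The program name, array size and data values are made up for illustration. Note that workshare only divides the work among threads when it is enclosed in a parallel region, so the sketch adds one. It assumes the file is compiled with OpenMP enabled (for example -mp with PGI, or -fopenmp with gfortran); without that flag the directives are treated as comments and the code simply runs serially.

```fortran
program workshare_demo
  implicit none
  integer, parameter :: n = 8          ! illustrative size, not from the post
  real :: a(n), b(n), c(n)
  integer :: i

  ! Illustrative data: b + c is negative for the first four elements.
  b = [(real(i), i = 1, n)]
  c = -4.5

  !$omp parallel
  !$omp workshare
  a(:) = b(:) + c(:)
  where (a(:) .lt. 0) a(:) = 0
  !$omp end workshare
  !$omp end parallel

  ! The first four elements have been clamped to zero by the where statement.
  print '(8f6.1)', a
end program workshare_demo
```

Each statement inside the workshare block is completed by all threads before the next one starts, so the where sees the fully computed a.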
2. Does the forall construct perform better than a do loop?
Typically, no. The forall statement and construct explicitly state that the right-hand sides can be computed in parallel and that all the assignments to the left-hand sides can be done in parallel, but they don't state that the right-hand side is actually independent of the left-hand side. This means the compiler has to use a temporary array to hold the right-hand-side values, then assign from the temporary array to the actual left-hand side. An important optimization is to remove that temporary array, but it can't always be done.
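To make the temporary-array point concrete, here is a small sketch (the program and variable names are just for illustration) where the right-hand side does depend on the left-hand side. The forall has to behave as if all the right-hand-side values were saved first, while a do loop written the same way picks up the values stored by earlier iterations.

```fortran
program forall_temp_demo
  implicit none
  integer, parameter :: n = 6          ! illustrative size
  integer :: a(n), d(n), i

  a = [(i, i = 1, n)]                  ! a = 1 2 3 4 5 6
  d = a

  ! forall: every right-hand side is evaluated before any assignment,
  ! so a(i-1) always refers to the ORIGINAL value. The result is a
  ! shifted copy: 1 1 2 3 4 5. Conceptually this needs a temporary.
  forall (i = 2:n) a(i) = a(i-1)

  ! do loop: iteration i sees the value stored by iteration i-1,
  ! so the first element propagates: 1 1 1 1 1 1. No temporary needed.
  do i = 2, n
     d(i) = d(i-1)
  end do

  print '(a,6i3)', 'forall: ', a
  print '(a,6i3)', 'do:     ', d
end program forall_temp_demo
```

When the compiler can prove there is no such overlap (for example, the left-hand side is a different array that cannot alias the right-hand side), it can drop the temporary and the forall becomes equivalent to the plain loop.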
3. Is there any performance difference between allocatable and pointer in the PGI implementation?
Yes. Allocatables have two important characteristics: First, unless they have the target attribute, the compiler knows there are no aliases with other variables. Even with the target attribute, the compiler knows there are no aliases with other allocatables or non-pointer variables or arrays. This allows better alias analysis and hence better optimization; an example is when optimizing away the temp array for forall assignments, from the previous question. Second, the compiler knows how the array was allocated, so it knows that it's contiguous and stride-1 on the leftmost dimension. Pointer arrays could have been pointer assigned from any array section, so they are not necessarily contiguous or stride-1.
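A short sketch of both characteristics, using made-up names and sizes. It assumes a compiler that supports the Fortran 2008 is_contiguous intrinsic. The pointer is associated with a strided section of the allocatable, which is only legal because the allocatable carries the target attribute.

```fortran
program alloc_vs_pointer
  implicit none
  real, allocatable, target :: a(:)    ! target is required for p => a(...)
  real, pointer :: p(:)
  integer :: i

  allocate(a(10))
  a = [(real(i), i = 1, 10)]

  ! A pointer can be associated with any array section,
  ! here every other element of a.
  p => a(1:10:2)

  ! The allocatable is contiguous by construction; the strided
  ! pointer section is not.
  print *, 'allocatable contiguous: ', is_contiguous(a)
  print *, 'pointer contiguous:     ', is_contiguous(p)

  ! p aliases a: writing through p zeroes the odd-indexed elements of a.
  p = 0.0
  print '(10f5.1)', a
end program alloc_vs_pointer
```

Because the compiler generally cannot know at compile time which of these situations a given pointer is in, it has to generate code that handles arbitrary strides and possible aliasing, whereas for an allocatable without the target attribute it can assume neither occurs.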
4. Does the PGI Fortran compiler do alias analysis so that it can tell that the regions pointed to by two pointers do not overlap?
It does some, yes, even interprocedurally, but it's not as successful as we would hope.