PGI User Forum
parallelize inner loop

 
WENYANG LIU




Posted: Thu Oct 14, 2010 5:48 am    Post subject: parallelize inner loop

Hi,
I am trying to parallelize only the inner loop, as follows:

Code:

program innerloop

implicit none
integer::i,j,f(3),f_j(10)

 
f=0


!$acc data region local(f_j),copy(f)

do i=1,3
!$acc region
   do j=1,10
      f_j(j)=i*j
   end do

   f(i)=sum(f_j)
!$acc end region     
end do


!$acc end data region


write(*,*)f

end program


However, the program prints "0 0 0" for "f", while the correct output should be "55 110 165".
Can anyone point out my mistake? The pgfortran version is 10.3.
mkcolg



Location: The Portland Group Inc.

Posted: Thu Oct 14, 2010 4:05 pm

Hi WENYANG LIU,

This looks like it may be more of a compiler issue. I have sent a report to our engineers (TPR#17286) for further investigation.

The workaround is to either remove the "copy(f)" clause from the data region directive or modify the code as follows:
Code:
% cat tmp2.f90
program innerloop

implicit none
integer::i,j
real :: f(3),f_j(10)
 
f=0

!$acc region local(f_j)
!$acc do host
do i=1,3
   do j=1,10
      f_j(j)=i*j
   end do
   f(i)=sum(f_j)
end do
!$acc end region     

write(*,*)f

end program
% pgf90 -ta=nvidia -Minfo -V10.3 tmp2.f90 ; a.out
innerloop:
      9, Generating local(f_j(:))
         Generating compute capability 1.0 kernel
         Generating compute capability 1.3 kernel
     11, Parallelization would require privatization of array 'f_j(1:10)'
         Sequential loop scheduled on host
     12, Loop is parallelizable
         Accelerator kernel generated
         12, !$acc do parallel, vector(10)
     15, sum reduction inlined
         Loop is parallelizable
         Accelerator kernel generated
         15, !$acc do parallel, vector(10)
             Sum reduction generated for f_j$r
    55.00000        110.0000        165.0000   
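
For comparison, the first workaround mentioned above (dropping "copy(f)" from the data region directive) might look like the sketch below. It is simply the original program with that one clause removed, and it has not been run against 10.3 here. The program name is made up for illustration. A compiler without accelerator support treats the directives as comments, so the source also builds as plain Fortran.

```fortran
program innerloop_nocopy
  implicit none
  integer :: i, j, f(3), f_j(10)

  f = 0

  ! Same as the original program, minus copy(f) on the data region.
  !$acc data region local(f_j)
  do i = 1, 3
     !$acc region
     do j = 1, 10
        f_j(j) = i*j
     end do
     f(i) = sum(f_j)
     !$acc end region
  end do
  !$acc end data region

  write(*,*) f   ! intended values: 55 110 165
end program innerloop_nocopy
```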

Thanks,
Mat
WENYANG LIU




Posted: Thu Oct 14, 2010 6:42 pm

Hi Mkcolg,

Thanks for your reply.
I have a question regarding the modified code you provided:
Since the "i" loop is scheduled on the host, why is a sum reduction generated for f_j on line 15?
mkcolg




Posted: Fri Oct 15, 2010 2:09 pm

Hi WENYANG LIU,

Only the outer i loop is scheduled on the host. The two inner loops (sum is really a loop) are accelerated with the compiler generating a kernel for each. Do you not want the sum reduction parallelized?
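
To make the loop structure easier to see, here is a host-only sketch of the modified program with sum(f_j) written out by hand. The comments map each loop to the line numbers in the -Minfo output above. The program name and the "tmp" variable are illustrative, and the directives are left out on purpose, so this shows only the shape of the loops, not what the compiler actually generates.

```fortran
program two_inner_loops
  implicit none
  integer :: i, j
  real :: f(3), f_j(10), tmp

  f = 0

  do i = 1, 3               ! outer loop (line 11): sequential, on the host
     do j = 1, 10           ! first inner loop (line 12): one kernel
        f_j(j) = i*j
     end do
     tmp = 0                ! sum(f_j) spelled out as the loop it really is
     do j = 1, 10           ! second inner loop (line 15): the sum reduction
        tmp = tmp + f_j(j)
     end do
     f(i) = tmp
  end do

  write(*,*) f
end program two_inner_loops
```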

- Mat
mkcolg




Posted: Thu Jul 14, 2011 11:28 am

Hi WENYANG LIU,

TPR#17286 was fixed in the 11.6 release when we added support for scalar kernels. The problem here was that "f(i)=sum(f_j)" gets transformed into:
Code:
tmp = 0
do j = 1,10
   tmp = tmp + f_j(j)
enddo
f(i) = tmp
While the do loop is scheduled on the device, the final update "f(i) = tmp" had to be performed on the host. However, because "f" is copied back from the device at the end of the data region, the host values get overwritten by the device copy, which still holds the initial zeros.

Adding support for scalar kernels in 11.6 allows for "f(i) = tmp" to be executed on the device.
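
The effect can be imitated in plain Fortran by keeping two copies of "f", one standing in for host memory and one for device memory. This is only a rough model of the behavior described above, not what the compiler emits: the names f_host, f_dev, scalar_kernels, and final_f are invented for the sketch, and the "copies" are ordinary array assignments.

```fortran
program stale_copy_model
  implicit none
  integer :: i, j, pass, tmp
  integer :: f_host(3), f_dev(3), f_j(10), final_f(3,2)
  logical :: scalar_kernels

  do pass = 1, 2
     scalar_kernels = (pass == 2)   ! pass 1: before 11.6, pass 2: 11.6 and later

     f_host = 0
     f_dev = f_host                 ! copy(f): host to device on entry

     do i = 1, 3
        do j = 1, 10
           f_j(j) = i*j
        end do
        tmp = sum(f_j)              ! the reduction loop, run on the device
        if (scalar_kernels) then
           f_dev(i) = tmp           ! update happens in device memory
        else
           f_host(i) = tmp          ! update happens in host memory only
        end if
     end do

     f_host = f_dev                 ! copy(f): device to host on exit
     final_f(:, pass) = f_host
     write(*,*) f_host
  end do
end program stale_copy_model
```

In the first pass the host-side updates are lost when the untouched device copy is written back, so all three values come out as zero. In the second pass the updates land in the device copy and survive the copy back.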

Thanks,
Mat
All times are GMT - 7 Hours