PGI User Forum


Question about data movement as seen from compiler feedback
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Thu Jan 24, 2013 2:24 pm

Hi Ping,

Could you post an example of the Minfo output as well as a reproducing example? This will help me answer your first two questions.

For the third: yes, there is some overhead in performing the present look-up, but it is fairly small.

- Mat
appleluo



Joined: 21 Nov 2012
Posts: 19

Posted: Fri Jan 25, 2013 9:24 am

Hi Mat,

Here is an example.

========= Begin program ==========
module mod1
  real*8, allocatable :: a(:,:), b(:,:), c(:,:)
end module mod1

program prog1
  use mod1

  allocate(a(100,100),b(100,100),c(100,100))
  c=0.0d0
  a=1.13240d0
  b=2.33413d0

  call sub1

end program prog1

subroutine sub1
  use mod1
  integer i,j,k
  !$acc data copyin(a,b) copy(c)

  !$acc kernels loop present(a, b, c)
  do j=1,100
    do i=1,100
      do k=1,100
        c(i,j) = c(i,j)+a(i,k)*b(k,j)
      enddo
    enddo
  enddo

  !$acc end kernels

  !$acc end data

end subroutine sub1
=========End of program===============
======== Begin compiler output ===========
pgfortran -acc -Minfo main.f90
prog1:
      9, Memory zero idiom, array assignment replaced by call to pgf90_mzero8
     10, Memory set idiom, array assignment replaced by call to pgf90_mset8
     11, Memory set idiom, array assignment replaced by call to pgf90_mset8
sub1:
     20, Generating copyin(b(:,:))
         Generating copyin(a(:,:))
         Generating copy(c(:,:))
     22, Generating present_or_copy(c(:,:))
         Generating present_or_copyin(b(:,:))
         Generating present_or_copyin(a(:,:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     23, Loop is parallelizable
     24, Loop is parallelizable
     25, Complex loop carried dependence of 'c' prevents parallelization
         Loop carried dependence of 'c' prevents parallelization
         Loop carried backward dependence of 'c' prevents vectorization
         Inner sequential loop scheduled on accelerator
         Accelerator kernel generated
         23, !$acc loop gang ! blockidx%y
         24, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         25, CC 1.3 : 17 registers; 136 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 33 registers; 0 shared, 152 constant, 0 local memory bytes

If I delete present(a, b, c) from the kernels loop directive, the compiler output is as follows:

sub1:
     20, Generating copyin(b(:,:))
         Generating copyin(a(:,:))
         Generating copy(c(:,:))
     22, Generating copy(c(:,:))
         Generating copyin(a(:,:))
         Generating copyin(b(:,:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary


Thanks,

Ping
mkcolg




Posted: Mon Jan 28, 2013 11:42 am

Hi Ping,

You must be using an older version of the compiler. The -Minfo messages were not initially updated to reflect the "present_or_copy" change that occurred in the 12.6 release, so older compilers still print "copy" and "copyin" for the compute region. This was corrected in the 12.9 release.

Here's the output from 12.8 and 12.9:
Code:

% pgf90 -acc -Minfo test2.f90 -V12.8
prog1:
      9, Memory zero idiom, array assignment replaced by call to pgf90_mzero8
     10, Memory set idiom, array assignment replaced by call to pgf90_mset8
     11, Memory set idiom, array assignment replaced by call to pgf90_mset8
sub1:
     20, Generating copyin(b(:,:))
         Generating copyin(a(:,:))
         Generating copy(c(:,:))
     22, Generating copy(c(:,:))
         Generating copyin(a(:,:))
         Generating copyin(b(:,:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     24, Loop is parallelizable
     25, Loop is parallelizable
     26, Complex loop carried dependence of 'c' prevents parallelization
         Loop carried dependence of 'c' prevents parallelization
         Loop carried backward dependence of 'c' prevents vectorization
         Inner sequential loop scheduled on accelerator
         Accelerator kernel generated
         24, !$acc loop gang ! blockidx%y
         25, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         26, CC 1.3 : 17 registers; 128 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 33 registers; 0 shared, 144 constant, 0 local memory bytes

% pgf90 -acc -Minfo test2.f90 -V12.9
prog1:
      9, Memory zero idiom, array assignment replaced by call to pgf90_mzero8
     10, Memory set idiom, array assignment replaced by call to pgf90_mset8
     11, Memory set idiom, array assignment replaced by call to pgf90_mset8
sub1:
     20, Generating copyin(b(:,:))
         Generating copyin(a(:,:))
         Generating copy(c(:,:))
     22, Generating present_or_copy(c(:,:))
         Generating present_or_copyin(a(:,:))
         Generating present_or_copyin(b(:,:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     24, Loop is parallelizable
     25, Loop is parallelizable
     26, Complex loop carried dependence of 'c' prevents parallelization
         Loop carried dependence of 'c' prevents parallelization
         Loop carried backward dependence of 'c' prevents vectorization
         Inner sequential loop scheduled on accelerator
         Accelerator kernel generated
         24, !$acc loop gang ! blockidx%y
         25, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         26, CC 1.3 : 17 registers; 112 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 42 registers; 0 shared, 128 constant, 0 local memory bytes
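
For reference, the clauses named in the 12.9 messages can also be written out on the directive itself. The following is only a minimal sketch, assuming the sample program from this thread and assuming the implicit handling matches what the 12.9 messages report. It folds sub1 into the main program for brevity and adds a print statement, so it is not the original test2.f90 and its line numbers will not match the output above.

```fortran
! Sketch only: explicit spelling of the clauses that the 12.9 -Minfo
! messages report for the compute region. Not the original test2.f90.
module mod1
  real*8, allocatable :: a(:,:), b(:,:), c(:,:)
end module mod1

program prog1
  use mod1
  implicit none
  integer :: i, j, k

  allocate(a(100,100), b(100,100), c(100,100))
  c = 0.0d0
  a = 1.13240d0
  b = 2.33413d0

  ! The enclosing data region creates the device copies and moves the data.
  !$acc data copyin(a, b) copy(c)

  ! present_or_* clauses reuse a device copy when one is already present
  ! and only allocate and copy when it is not.
  !$acc kernels loop present_or_copyin(a, b) present_or_copy(c)
  do j = 1, 100
    do i = 1, 100
      do k = 1, 100
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      enddo
    enddo
  enddo

  !$acc end data

  ! Added for illustration so the result can be inspected on the host.
  print *, c(1,1)
end program prog1
```

Built without -acc, the directives are treated as comments and the loop nest runs on the host, which is a simple way to check the result before comparing -Minfo output between compiler versions.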


Sorry for the confusion,
Mat
All times are GMT - 7 Hours