PGI User Forum

PGI Accelerator programming concepts questions
mkcolg



Joined: 30 Jun 2004
Posts: 6210
Location: The Portland Group Inc.

Posted: Mon Nov 29, 2010 3:48 pm

Hi A,

Here's a simple example of using the reflected directive:

Code:
% cat refected.f90

module mm
contains
 subroutine sub1( a, b, c )
  implicit none
  real :: a(:,:), b(:,:), c(:,:)
  !$acc reflected(a)
  integer :: i,j
  !$acc region
   do j = 1,ubound(a,2)
    do i = 1,ubound(a,1)
     a(i,j) = b(i,j) + c(i,j)
    enddo
   enddo
  !$acc end region
 end subroutine
end module

program p
 use mm
 use accel_lib
 implicit none
 integer, parameter :: n=32,m=32
 real :: a(n,m), b(n,m), c(n,m)
 integer :: i,j
 do j = 1,m
  do i = 1,n
   a(i,j) = -1.0
   b(i,j) = (j*100) + i
   c(i,j) = -(j*100) + i
  enddo
 enddo

 !$acc data region copyout(a)
  call sub1(a,b,c)
 !$acc end data region

  print *, a(1,1), a(n,m)
  print *, b(1,1), b(n,m)
  print *, c(1,1), c(n,m)
end program
% pgf90 -ta=nvidia -Minfo=accel refected.f90 -V11.0 ; a.out
sub1:
      7, Generating local(a(:,:))
      9, Generating copyin(b(1:z_b_0,1:z_b_3))
         Generating copyin(c(1:z_b_0,1:z_b_3))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     10, Loop is parallelizable
     11, Loop is parallelizable
         Accelerator kernel generated
         10, !$acc do parallel, vector(16) ! blockidx%y threadidx%y
         11, !$acc do parallel, vector(16) ! blockidx%x threadidx%x
             CC 1.0 : 7 registers; 64 shared, 8 constant, 0 local memory bytes; 100% occupancy
             CC 1.3 : 8 registers; 64 shared, 8 constant, 0 local memory bytes; 100% occupancy
             CC 2.0 : 15 registers; 8 shared, 72 constant, 0 local memory bytes; 100% occupancy
p:
     34, Generating copyout(a(:,:))
    2.000000        64.00000   
    101.0000        3232.000   
   -99.00000       -3168.000   
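
As a quick sanity check, the expected values can be worked out by hand from the initialization loops in the program. Nothing here goes beyond the code already shown. With n = m = 32:

```latex
\begin{aligned}
a(i,j) &= b(i,j) + c(i,j) = (100j + i) + (-100j + i) = 2i \\
a(1,1) &= 2, \qquad a(32,32) = 64 \\
b(1,1) &= 101, \qquad b(32,32) = 3232 \\
c(1,1) &= -99, \qquad c(32,32) = -3168
\end{aligned}
```

These agree with the three printed lines. Since a is initialized to -1.0 on the host, seeing 2 and 64 suggests the result was computed in the accelerator region and copied back by the copyout(a) clause on the data region.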


Quote:
Yet in December? It is already documented in the Programming Model Guide :)
The Model is ahead of the implementation, though in 11.0 we will have the PGI 1.2 Accelerator Model fully implemented. There is still more work to do, since the spec for the 1.3 Model just came out as well.

Quote:
Any ETA when exactly in December will You release the 11th compiler suite?
Right now we're finishing up a few last-minute fixes. Barring any show-stopping errors, we expect 11.0 to be available mid-month.

- Mat
All times are GMT - 7 Hours