PGI User Forum

does "acc loop seq" work

 
Alexey A. Romanenko



Joined: 17 Feb 2012
Posts: 36

Posted: Mon Oct 01, 2012 8:53 pm    Post subject: does "acc loop seq" work

Hi!

I have nested loops. I marked the inner loops with "!$acc loop seq", but judging by the compiler's output, the compiler seems to ignore this directive. I ran into this while writing a small program to reproduce another bug I had encountered.

Code:
       program test

       integer, parameter :: SZ = 12800000

       integer :: i,j,k
       real*8  :: d(SZ)
       real*8  :: tmp(128)

       d(:) = 0.0
!$acc kernels loop private(tmp) independent
       do i=0,((SZ/128)-1)
!$acc loop seq
          do j=1,128
             tmp(j) = 1.0/j
          enddo
!$acc loop seq
          do j=1,128
             d(i*128+j) = tmp(j)*3.1415
          enddo
       enddo
!$acc end kernels

       print *, 'sum = ', sum(d)

       end program


Output:
Code:
 pgi$ pgfortran -acc -Minfo test.f90
test:
     13, Generating present_or_copy(d(:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     14, Loop is parallelizable
     16, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         16, CC 1.3 : 15 registers; 20 shared, 48 constant, 0 local memory bytes
             CC 2.0 : 17 registers; 0 shared, 48 constant, 0 local memory bytes
     20, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         20, CC 1.3 : 10 registers; 24 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 17 registers; 0 shared, 44 constant, 0 local memory bytes
     26, sum reduction inlined


Result is correct.

Any ideas? Should I use "acc parallel loop" instead of "kernels"?

Alexey
mkcolg



Joined: 30 Jun 2004
Posts: 6211
Location: The Portland Group Inc.

Posted: Tue Oct 02, 2012 10:02 am    Post subject: Re: does "acc loop seq" work

Hi Alexey,

What the compiler has done here is split your outer loop into two separate kernels, which look something like this:
Code:

!$acc kernels loop
       do i=0,((SZ/128)-1)
!$acc loop seq
          do j=1,128
             tmp(i,j) = 1.0/j
          enddo
       enddo

!$acc kernels loop
       do i=0,((SZ/128)-1)
!$acc loop seq
          do j=1,128
             d(i*128+j) = tmp(i,j)*3.1415
          enddo
       enddo

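So the "seq" is being preserved. It's just that two different kernels are now created from the outer loop, which is why the -Minfo output lists the schedule for line 14 (gang, vector(128)) twice, once per kernel.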
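
To see why splitting the outer loop this way does not change the result, here is a minimal host-only sketch (plain Fortran, no directives) that runs the original fused nest and the split form side by side and compares them. The names nblk, blk, tmp2, d_fused and d_split are illustrative, and the problem size is reduced from the thread's SZ. Expanding tmp by the outer index only mirrors the tmp(i,j) notation above; it is not a claim about what the compiler actually allocates.

```fortran
program fission_demo
   implicit none
   ! Assumption: a reduced size (1000 blocks of 128) instead of the thread's SZ.
   integer, parameter :: nblk = 1000, blk = 128
   integer :: i, j
   double precision :: d_fused(nblk*blk), d_split(nblk*blk)
   double precision :: tmp(blk)              ! scratch reused by every outer iteration
   double precision :: tmp2(0:nblk-1, blk)   ! scratch expanded by the outer index

   ! Fused form: one outer loop containing both sequential inner loops.
   do i = 0, nblk-1
      do j = 1, blk
         tmp(j) = 1.0/j
      enddo
      do j = 1, blk
         d_fused(i*blk+j) = tmp(j)*3.1415
      enddo
   enddo

   ! Split form: the outer loop is distributed over the two inner loops,
   ! so the first nest finishes entirely before the second one starts.
   do i = 0, nblk-1
      do j = 1, blk
         tmp2(i,j) = 1.0/j
      enddo
   enddo
   do i = 0, nblk-1
      do j = 1, blk
         d_split(i*blk+j) = tmp2(i,j)*3.1415
      enddo
   enddo

   ! Both forms perform the same arithmetic per element; allow a tiny tolerance.
   if (maxval(abs(d_fused - d_split)) > 1.0d-12) then
      print *, 'mismatch between fused and split forms'
      error stop 1
   endif
   print *, 'fused and split forms agree'
end program fission_demo
```

In both forms the inner j loops still run in order, which is all that "seq" asks for. The only thing that changes is how many times the outer i loop is launched.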
Quote:
Should I use "acc parallel loop" instead of "kernels"?

If you want to force the use of a single kernel, then yes, you can use "parallel" here.
Code:
% cat f10_2_12.f90
       program test

       integer, parameter :: SZ = 12800000

       integer :: i,j,k
       real*8  :: d(SZ)
       real*8  :: tmp(128)

       d(:) = 0.0
!$acc parallel loop private(tmp)
       do i=0,((SZ/128)-1)
!$acc loop seq
          do j=1,128
             tmp(j) = 1.0/j
          enddo
!$acc loop seq
          do j=1,128
             d(i*128+j) = tmp(j)*3.1415
          enddo
       enddo

       print *, 'sum = ', sum(d)

       end program
% pgf90 -acc -Minfo=accel f10_2_12.f90 -V12.9
test:
     10, Accelerator kernel generated
         10, CC 1.3 : 17 registers; 32 shared, 48 constant, 0 local memory bytes
             CC 2.0 : 22 registers; 0 shared, 56 constant, 0 local memory bytes
         11, !$acc loop gang, vector(256) ! blockidx%x threadidx%x
     10, Generating present_or_copy(d(:))
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     13, Loop is parallelizable
     17, Loop is parallelizable
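
If you would rather state the schedule than leave it to the compiler, the following is a minimal sketch under a few assumptions: the outer loop is given "gang vector" (the same kind of schedule the -Minfo output above reports for it), the data movement is written out with copy(d), and the size is reduced from the thread's SZ for a quick run. The program name, the blk and sz parameters, and the host-side check at the end are illustrative additions and not part of the original test case. Without -acc the directives are treated as comments and the program runs on the host.

```fortran
program explicit_schedule
   implicit none
   integer, parameter :: blk = 128
   ! Assumption: 1000 blocks instead of the thread's SZ, to keep the run short.
   integer, parameter :: sz = 1000*blk
   integer :: i, j
   double precision :: d(sz), tmp(blk), err

   d(:) = 0.0d0

   ! The outer loop is shared among gangs and vector lanes; each iteration
   ! works on its own private copy of tmp. Both inner loops are sequential.
!$acc parallel loop gang vector private(tmp) copy(d)
   do i = 0, (sz/blk)-1
!$acc loop seq
      do j = 1, blk
         tmp(j) = 1.0/j
      enddo
!$acc loop seq
      do j = 1, blk
         d(i*blk+j) = tmp(j)*3.1415
      enddo
   enddo

   ! Host-side check: recompute every element and record the largest difference.
   err = 0.0d0
   do i = 0, (sz/blk)-1
      do j = 1, blk
         err = max(err, abs(d(i*blk+j) - dble(1.0/j)*3.1415))
      enddo
   enddo

   ! A loose tolerance, since device single-precision division may round differently.
   if (err > 1.0d-6) then
      print *, 'unexpected difference: ', err
      error stop 1
   endif
   print *, 'result matches host recomputation'
end program explicit_schedule
```

Compiling with -Minfo=accel, as in the command above, should show whether the compiler honoured the requested schedule for the outer loop.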


Hope this helps,
Mat
Alexey A. Romanenko



Joined: 17 Feb 2012
Posts: 36

Posted: Tue Oct 02, 2012 10:58 pm    Post subject: Re: does "acc loop seq" work

Thank you, Mat.