PGI User Forum


grouping specific loops into a kernel

 
Minh Duc Nguyen



Joined: 15 Apr 2010
Posts: 6

Posted: Tue May 07, 2013 11:06 am    Post subject: grouping specific loops into a kernel

Hello,

I want to map the two outermost loops (k and j) into a single kernel. The k and j loops should be executed in vector mode, and each thread in the kernel should execute the i loop sequentially.

How can I force the PGI compiler to do that using OpenACC directives?

Code:


      program testbla
      use openacc
        integer :: i, j, k, n
        real, allocatable, dimension(:,:,:) :: a, b, c
      n = 10
      allocate(a(n,n,n), b(n,n,n), c(n,n,n))
      do k = 1, n
        do j = 1, n
          do i = 1, n
            a(i,j,k) = 0.0
            b(i,j,k) = 1.0
          enddo
        enddo
      enddo
      do k = 1, n
        do j = 1, n
          do i = 1, n
            c(i,j,k) =  a(i,j,k) + b(i,j,k)
          enddo
          do i = 1, n
            c(i,j,k) =  a(i,j,k) + b(i,j,k)
          enddo
        enddo
      enddo
      !print *, c
      end program testbla


Thank you.
mkcolg



Joined: 30 Jun 2004
Posts: 4996
Location: The Portland Group Inc.

Posted: Tue May 07, 2013 11:43 am

There are a couple of ways. With the "kernels" construct, adding "collapse(2) gang vector" to the loop directive merges the "k" and "j" loops into a 2-D gang and 2-D vector schedule. Alternatively, with the "parallel" construct, you can put "loop gang" on the "k" loop and "loop vector" on the "j" loop. In either case, add a "loop seq" directive on each "i" loop to force it to run sequentially.

- Mat

Code:
% cat test3.f90

      program testbla
      use openacc
        integer :: i, j, k, n
        real, allocatable, dimension(:,:,:) :: a, b, c
      n = 10
      allocate(a(n,n,n), b(n,n,n), c(n,n,n))

!$acc kernels loop collapse(2) gang vector
      do k = 1, n
        do j = 1, n
!$acc loop seq
          do i = 1, n
            a(i,j,k) = 0.0
            b(i,j,k) = 1.0
          enddo
        enddo
      enddo

!$acc parallel loop gang
      do k = 1, n
!$acc loop vector
        do j = 1, n
!$acc loop seq
          do i = 1, n
            c(i,j,k) =  a(i,j,k) + b(i,j,k)
          enddo
!$acc loop seq
          do i = 1, n
            c(i,j,k) =  a(i,j,k) + b(i,j,k)
          enddo
        enddo
      enddo

      ! need to print this out, otherwise dead-code
      ! elimination will remove the above loops
      print *, c(1,1,1), c(n,n,n)
      end program testbla
% pgf90 -acc -Minfo=accel test3.f90 -V13.5
testbla:
     10, Generating present_or_copyout(b(1:10,1:10,1:10))
         Generating present_or_copyout(a(1:10,1:10,1:10))
         Generating NVIDIA code
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
         Generating compute capability 3.0 binary
     11, Loop is parallelizable
     12, Loop is parallelizable
     14, Loop is parallelizable
         Accelerator kernel generated
         11, !$acc loop gang, vector(4) ! blockidx%y threadidx%y
         12, !$acc loop gang, vector(32) ! blockidx%x threadidx%x
     21, Accelerator kernel generated
         22, !$acc loop gang ! blockidx%x
         24, !$acc loop vector(256) ! threadidx%x
     21, Generating present_or_copyin(b(1:10,1:10,1:10))
         Generating present_or_copyin(a(1:10,1:10,1:10))
         Generating present_or_copyout(c(1:10,1:10,1:10))
         Generating NVIDIA code
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
         Generating compute capability 3.0 binary
     24, Loop is parallelizable
     26, Loop is parallelizable
     30, Loop is parallelizable
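
The two approaches above can also be combined, so that the second loop nest (the one with two "i" loops) uses "collapse(2)" as well. The following is only a sketch under assumptions: the program name testcollapse, the explicit copyin/copyout clauses, and the final check are illustrative and not part of the original posts, and it has not been verified against the -Minfo output shown above. Without -acc the directives are treated as comments and the program runs serially, which makes the result check easy to try.

```fortran
      program testcollapse
      implicit none
      integer :: i, j, k, n
      real, allocatable, dimension(:,:,:) :: a, b, c
      n = 10
      allocate(a(n,n,n), b(n,n,n), c(n,n,n))
      a = 0.0
      b = 1.0
      c = -1.0

      ! One kernel for the whole k/j nest: collapse(2) merges k and j
      ! into a single iteration space spread over gangs and vectors.
      ! The data clauses are an assumption; the compiler can also
      ! generate them implicitly.
!$acc parallel loop collapse(2) gang vector copyin(a,b) copyout(c)
      do k = 1, n
        do j = 1, n
          ! Each thread runs both i loops sequentially.
!$acc loop seq
          do i = 1, n
            c(i,j,k) = a(i,j,k) + b(i,j,k)
          enddo
!$acc loop seq
          do i = 1, n
            c(i,j,k) = a(i,j,k) + b(i,j,k)
          enddo
        enddo
      enddo

      ! Every element should be 0.0 + 1.0 = 1.0. Using the result
      ! also keeps dead-code elimination from removing the loops.
      if (all(c == 1.0)) then
        print '(a)', 'ok'
      else
        print '(a)', 'mismatch'
      endif
      end program testcollapse
```

Compiling with -Minfo=accel, as in the session above, should show whether a single kernel was generated for the nest and which schedule the compiler chose for it.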
All times are GMT - 7 Hours