PGI User Forum

Task Parallelism using Accelerators directives

 
Fedele.Stabile



Joined: 08 Feb 2012
Posts: 9

Posted: Mon Mar 26, 2012 3:51 am    Post subject: Task Parallelism using Accelerators directives

Hi,
I have two independent tasks in my code, as in the example below.
How can I instruct the compiler to execute them in parallel, and how can I
force a synchronization at the end of the two tasks?
Example:

! sum of two matrices A and B, task 1
do i = 1, 512
   do j = 1, 512
      C(i,j) = A(i,j) + B(i,j)
   enddo
enddo
!
! multiplication of two matrices A and B, task 2
do i = 1, 512
   do j = 1, 512
      D(i,j) = 0
   enddo
enddo
do i = 1, 512
   do j = 1, 512
      do k = 1, 512
         D(i,j) = D(i,j) + A(i,k)*B(k,j)
      enddo
   enddo
enddo
! Need synchronization
! sum of the two matrices D and C, the results of the preceding tasks
do i = 1, 512
   do j = 1, 512
      E(i,j) = D(i,j) + C(i,j)
   enddo
enddo
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Mon Mar 26, 2012 11:46 am    Post subject:

Hi Fedele.Stabile,

Use the "async" clause and then add a "wait" directive to synchronise.

Something along the lines of:
Code:

integer(4) :: handle

handle = 1

!$acc data region copyin(A,B), copyout(E), local(C,D)

! sum of two matrices A and B, task 1
!$acc region async(handle)
do i = 1, 512
   do j = 1, 512
      C(i,j) = A(i,j) + B(i,j)
   enddo
enddo
!$acc end region
!
! multiplication of two matrices A and B, task 2
!$acc region async(handle)
do i = 1, 512
   do j = 1, 512
      D(i,j) = 0
   enddo
enddo
do i = 1, 512
   do j = 1, 512
      do k = 1, 512
         D(i,j) = D(i,j) + A(i,k)*B(k,j)
      enddo
   enddo
enddo
!$acc end region

!$acc wait(handle)
! Need synchronization
! sum of the two matrices D and C, the results of the preceding tasks
!$acc region
do i = 1, 512
   do j = 1, 512
      E(i,j) = D(i,j) + C(i,j)
   enddo
enddo
!$acc end region
!$acc end data region


Note that the use of a "handle" is optional.
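
A minimal sketch of the handle-free form, for illustration only. It assumes that a bare "async" clause launches the region asynchronously and that a "wait" directive with no argument blocks until all outstanding asynchronous work has finished; please check both against the PGI Accelerator documentation for your compiler version. The program name, the array size parameter n, and the element-wise product in task 2 are made up for the example.

```fortran
program async_nohandle
implicit none
integer, parameter :: n = 512      ! hypothetical size, matching the thread
real :: A(n,n), B(n,n), C(n,n), D(n,n)
integer :: i, j

A = 1.0
B = 2.0

!$acc data region copyin(A,B), copyout(C,D)

! task 1: element-wise sum, launched with no handle
!$acc region async
do i = 1, n
   do j = 1, n
      C(i,j) = A(i,j) + B(i,j)
   enddo
enddo
!$acc end region

! task 2: element-wise product, also launched with no handle
!$acc region async
do i = 1, n
   do j = 1, n
      D(i,j) = A(i,j) * B(i,j)
   enddo
enddo
!$acc end region

! assumption: with no argument, wait covers every outstanding async region
!$acc wait

!$acc end data region

print *, C(1,1), D(1,1)
end program async_nohandle
```

Without accelerator support enabled, the directives are treated as ordinary comments, so the same source should also build and run as plain host Fortran.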

- Mat
KarlW



Joined: 12 Jan 2009
Posts: 23

Posted: Wed May 02, 2012 4:01 am    Post subject:

Hi Mat,

Would it be possible for you to add an asynchronous data transfer into this example?

I have tried the following code, but the CUDA_PROFILE output indicates that the transfer of F occurs on the same streamid as the transfers of the other arrays, rather than on the same stream as the kernel executions.

The method used for the data transfer also appears to be the blocking "memcpyHtoD" rather than the asynchronous version I would have anticipated.

BTW, there seems to be an issue when compiling Accelerator code that uses the async clause together with the -Mcuda flag. In this case the profiling output indicates that all kernels and transfers are executed on streamid 0.

Thanks in advance,

Karl

Code:
program asynctest
integer, dimension(:,:) :: A(512,512)
integer, dimension(:,:) :: B(512,512)
integer, dimension(:,:) :: C(512,512)
integer, dimension(:,:) :: D(512,512)
integer, dimension(:,:) :: E(512,512)
integer, dimension(:,:) :: F(512,512)
!$acc mirror(F)
integer(4) :: handle

handle = 1


!$acc data region copyin(A,B), copyout(E), local(C,D)

!$acc update device(F) !$acc async(handle)

! sum of two matrix A and B, task 1
!$acc region async(handle)
!$acc do
do i=1, 512
   do j=1, 512
      C(i,j) = A(i,j) + B(i,j)
   enddo
enddo
!$acc end region

! multiplication of two matrix A and B, task 2
!$acc region async(handle)
!$acc do
do i = 1, 512
   do j = 1, 512
      D(i,j) = 0
   enddo
enddo
do i = 1, 512
   do j = 1, 512
      do k = 1, 512
         D(i,j) = D(i,j) + A(i,k)*B(k,j)
      enddo
   enddo
enddo
!$acc end region

!$acc wait(handle)
! Need synchronization

! sum of two matrix D and C (results of preceding tasks)
!$acc region
!$acc do
do i=1, 512
   do j=1, 512
      E(i,j) = D(i,j) + C(i,j)
   enddo
enddo
!$acc end region
!$acc end data region
end program asynctest
All times are GMT - 7 Hours