PGI User Forum

how to avoid this dependency
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Fri Nov 02, 2012 6:40 am    Post subject: how to avoid this dependency

!!acc kernels do private(i) !not allowed due to loop dependency
! do i = 1, nn
! sz(i) = D(i)*( s(i) - A(i,1) * sz(i+JA1) &
! - A(i,2) * sz(i+JA2) &
! - A(i,3) * sz(i+JA3) )
! end do
!!acc end kernels

The code above cannot be accelerated because of the loop-carried dependency on sz.

I tried to solve that by preparing a copy of sz called Fsz, like this:
double precision,dimension(n1:n2) :: sz
double precision,dimension(n1:n2) :: Fsz
...
do i = n1, n2
Fsz(i) = sz(i)
end do
and changed the loop as below:
!$acc kernels do private(i)
do i = 1, nn
sz(i) = D(i)*( s(i) - A(i,1) * Fsz(i+JA1) &
- A(i,2) * Fsz(i+JA2) &
- A(i,3) * Fsz(i+JA3) )
end do
!$acc end kernels

Is my idea wrong? Either way, I cannot get the right result.
(I had already sent the code to Mat and got very useful advice.)
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Fri Nov 02, 2012 6:14 pm    Post subject: I made a mistake there

It seems like I made a mistake there.
For the code below,
do i = 1, nn
sz(i) = D(i)*( s(i) - A(i,1) * sz(i+JA1) &
- A(i,2) * sz(i+JA2) &
- A(i,3) * sz(i+JA3) )
end do

each iteration reads the sz values that earlier iterations of the loop have already updated. So it is actually not equivalent to
do i = 1, nn
sz(i) = D(i)*( s(i) - A(i,1) * Fsz(i+JA1) &
- A(i,2) * Fsz(i+JA2) &
- A(i,3) * Fsz(i+JA3) )
end do

So I was wrong. But isn't there any solution?
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Fri Nov 02, 2012 10:46 pm    Post subject: I got it

Do I = 2, n
X(i) = X(i) + X(i-1)
Enddo

and

Do I = 2, n
X(i) = X(i) + X(i+1)
Enddo

are different.
I suppose I have to give up, since the JA2 in sy(i+JA2) is -1. If it were -10 or smaller, I might be able to find a solution.
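
The difference between those two loops can be checked with a small sketch. It is written in Python rather than Fortran only so that it runs anywhere. The function names and the test array are made up for illustration, and the loop bounds are adjusted so that no index falls outside the array.

```python
def backward_loop(x):
    """X(i) = X(i) + X(i-1): each step reads a value the loop has already updated."""
    x = list(x)
    for i in range(1, len(x)):
        x[i] = x[i] + x[i - 1]
    return x


def forward_loop(x):
    """X(i) = X(i) + X(i+1): each step reads a value the loop has not touched yet."""
    x = list(x)
    for i in range(len(x) - 1):
        x[i] = x[i] + x[i + 1]
    return x


def from_snapshot(x, offset):
    """Same update, but every read comes from an untouched copy of x."""
    old = list(x)
    new = list(x)
    for i in range(len(x)):
        j = i + offset
        if 0 <= j < len(x):  # skip neighbours that fall outside the array
            new[i] = old[i] + old[j]
    return new


x = [1, 2, 3, 4, 5]
print(backward_loop(x))      # a running sum, so iteration order matters
print(forward_loop(x))       # only original neighbour values are read
print(from_snapshot(x, +1))  # same result as forward_loop
print(from_snapshot(x, -1))  # not the same result as backward_loop
```

With the i+1 loop, reading from a copy changes nothing, so the iterations can run in any order. With the i-1 loop, the copy gives a different answer, which is the problem described above.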
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Mon Nov 05, 2012 4:17 pm

Hi Kevin,

Your strategy of using Fsz is correct for forward dependencies, since the value of sz(i+N) is fixed relative to the value of sz(i). However, your JA's are negative, resulting in a backward dependency, so there's not much that can be done except run the loop sequentially.
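
As a rough illustration of that point, here is a Python sketch (not the original Fortran) that applies the sz update once in place and once from a copy, for positive and for negative offsets. The array sizes, coefficient values, padding scheme, and the particular offsets standing in for JA1..JA3 are all assumptions chosen only to make the example runnable.

```python
def update_in_place(sz, D, s, A, offsets, pad):
    """Sequential reference: reads and writes the same array, like the original loop."""
    sz = list(sz)
    for i in range(len(D)):
        k = i + pad  # position of element i inside the padded array
        acc = s[i]
        for col, off in enumerate(offsets):
            acc -= A[i][col] * sz[k + off]
        sz[k] = D[i] * acc
    return sz


def update_from_copy(sz, D, s, A, offsets, pad):
    """Fsz-style variant: all reads come from a copy taken before the loop."""
    fsz = list(sz)  # plays the role of Fsz
    sz = list(sz)
    for i in range(len(D)):  # iterations no longer depend on each other
        k = i + pad
        acc = s[i]
        for col, off in enumerate(offsets):
            acc -= A[i][col] * fsz[k + off]
        sz[k] = D[i] * acc
    return sz


nn, pad = 6, 3
D = [0.5] * nn
s = [float(i + 1) for i in range(nn)]
A = [[0.25, 0.5, 0.125] for _ in range(nn)]
sz0 = [float(i % 4) for i in range(nn + 2 * pad)]

forward = (1, 2, 3)      # hypothetical positive JA1..JA3
backward = (-1, -2, -3)  # hypothetical negative JA1..JA3

same_forward = (update_in_place(sz0, D, s, A, forward, pad)
                == update_from_copy(sz0, D, s, A, forward, pad))
same_backward = (update_in_place(sz0, D, s, A, backward, pad)
                 == update_from_copy(sz0, D, s, A, backward, pad))
print(same_forward, same_backward)
```

In this sketch the two versions agree for the positive offsets and disagree for the negative ones, because only in the negative case does an iteration read an element that an earlier iteration has already overwritten.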

- Mat
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Thu Nov 08, 2012 7:35 am    Post subject: disappointed

Hi, everyone.
To get rid of the data dependency above, I tried to use the Multigrid method instead of ILU. But I still ran into problems that I cannot solve right now.
Part of the code looks like this:
!$acc kernels
do i = 1, NI*NJ
r(i) = c0
r0(i) = c0
p(i) = c0
yy(i) = c0
e(i) = c0
v(i) = c0
end do
c
c do i = 1, nn
c r(i) = B(i) - A(i,1) * X(i+JA1) - A(i,2) * X(i+JA2) &
c - A(i,3) * X(i) &
c - A(i,4) * X(i+JA4) - A(i,5) * X(i+JA5)
c end do
DO I=2,NIM
II=(I-1)*NJ+IJGR(L)
!$acc do private(r)
DO IJ=II+2,II+NJM
r(IJ)=QA(IJ)-APA(IJ)*FIA(IJ)-AEA(IJ)*FIA(IJ+NJ)-
* AWA(IJ)*FIA(IJ-NJ)-ASA(IJ)*FIA(IJ-1)-ANA(IJ)*FIA(IJ+1)
END DO
END DO
c
c c1 = c0
c do i = 1, nn
c c1 = c1 + r(i) * r(i)
c end do

DO I=2,NIM
II=(I-1)*NJ+IJGR(L)
DO IJ=II+2,II+NJM
c1 = c1 + r(IJ) * r(IJ)
END DO
END DO

c bb = c0
c do i = 1, nn
c bb = bb + B(i) * B(i)
c end do
DO I=2,NIM
II=(I-1)*NJ+IJGR(L)
DO IJ=II+2,II+NJM
bb = bb + QA(IJ) * QA(IJ)
END DO
END DO

DO I=2,NIM
II=(I-1)*NJ+IJGR(L)
!$acc do private(p, r0)
DO IJ=II+2,II+NJM
p(IJ) = r(IJ)
r0(IJ) = r(IJ)
END DO
END DO
!$acc end kernels

The compiler messages show nothing abnormal, although I don't understand why I am supposed to add do private in some places and not in others.

bicgstabmg:
2085, Generating present_or_copyin(ijgr(l))
Generating present_or_copyin(ana(:))
Generating present_or_copyin(asa(:))
Generating present_or_copyin(awa(:))
Generating present_or_copyin(aea(:))
Generating present_or_copyin(apa(:))
Generating present_or_copyin(fia(:))
Generating present_or_copyin(qa(:))
Generating copyin(r(:))
Generating copyout(r(:ni*nj))
Generating present_or_copyout(r0(:ni*nj))
Generating present_or_copyout(p(:ni*nj))
Generating present_or_copyout(yy(:ni*nj))
Generating present_or_copyout(e(:ni*nj))
Generating present_or_copyout(v(:ni*nj))
Generating compute capability 2.0 binary
2086, Loop is parallelizable
Accelerator kernel generated
2086, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
CC 2.0 : 12 registers; 0 shared, 108 constant, 0 local memory bytes
2100, Loop is parallelizable
Accelerator kernel generated
2100, !$acc loop gang ! blockidx%x
CC 2.0 : 24 registers; 16 shared, 144 constant, 0 local memory bytes
2103, !$acc loop vector(128) ! threadidx%x
Loop is parallelizable
2114, Loop is parallelizable
Accelerator kernel generated
2114, !$acc loop gang ! blockidx%x
CC 2.0 : 16 registers; 16 shared, 92 constant, 0 local memory bytes
2116, !$acc loop vector(128) ! threadidx%x
2117, Sum reduction generated for c1
2116, Loop is parallelizable
2125, Loop is parallelizable
Accelerator kernel generated
2125, !$acc loop gang ! blockidx%x
CC 2.0 : 16 registers; 16 shared, 92 constant, 0 local memory bytes
2127, !$acc loop vector(128) ! threadidx%x
2128, Sum reduction generated for bb
2127, Loop is parallelizable
2132, Loop is parallelizable
Accelerator kernel generated
2132, !$acc loop gang ! blockidx%x
CC 2.0 : 18 registers; 16 shared, 108 constant, 0 local memory bytes
2135, !$acc loop vector(128) ! threadidx%x
Loop is parallelizable
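
The "Sum reduction generated for c1" and "Sum reduction generated for bb" messages indicate that the compiler recognized the accumulations as reductions rather than as dependencies. A rough picture of the idea, as a Python sketch: partial sums are accumulated independently per chunk and combined at the end. The chunk size and function names are invented for illustration, and this is not the code the compiler actually generates.

```python
def sum_of_squares_sequential(r):
    """c1 = c1 + r(i)*r(i), one element after another."""
    c1 = 0
    for value in r:
        c1 += value * value
    return c1


def sum_of_squares_reduction(r, chunk=4):
    """Each chunk keeps its own partial sum; the partial sums are combined at the end."""
    partials = []
    for start in range(0, len(r), chunk):
        partial = 0  # private to this chunk
        for value in r[start:start + chunk]:
            partial += value * value
        partials.append(partial)
    total = 0
    for partial in partials:
        total += partial
    return total


r = list(range(1, 11))
print(sum_of_squares_sequential(r), sum_of_squares_reduction(r))
```

Integers are used here so that both versions agree exactly. With floating-point data the two orderings can differ in the last digits, since the additions happen in a different order.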

I am ashamed to admit that after a lot of practice I am still an amateur.
All times are GMT - 7 Hours