PGI User Forum


how to avoid this dependency
mkcolg



Joined: 30 Jun 2004
Posts: 6128
Location: The Portland Group Inc.

Posted: Thu Nov 08, 2012 2:08 pm

Hi Kevin,

You need to be careful when using the "private" clause. When you privatize an object, its lifetime is the same as the kernel in which it was used. Hence, by putting "r", "r0" and "p" in a private clause, the values are thrown away at the end of the loop. This is what's causing your wrong answers, so you need to remove these clauses.

Also, only rectangular loops can be accelerated, hence the inner IJ loops won't accelerate. The outer I loops should be ok, but you may need to add an "!$acc loop independent" directive above them. Due to the use of computed inner loop bounds, the compiler can't tell that the array updates in the IJ loop don't overlap across iterations of I.
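
One way to get a rectangular nest is to give both loops fixed bounds and compute the 1-D index inside the body. The following is only a sketch under assumptions: the sizes are made up, `ijst` stands in for IJGR(L), and NIM = NI-1, NJM = NJ-1 is a guess at the convention used in the real code. The `private(ij)` clause covers only the scalar index, not the arrays.

```fortran
program rectangular_copy
  implicit none
  ! Hypothetical sizes; the thread does not give the real grid dimensions.
  integer, parameter :: ni = 6, nj = 5
  integer, parameter :: nim = ni - 1, njm = nj - 1   ! assumed convention
  integer, parameter :: ijst = 0                     ! stands in for IJGR(L)
  real :: r(ni*nj), p(ni*nj), r0(ni*nj)
  integer :: i, j, ij

  do ij = 1, ni*nj
     r(ij) = real(ij)
  end do
  p  = 0.0
  r0 = 0.0

  ! Both loops have fixed bounds; the 1-D index is derived from (i, j).
  !$acc kernels
  !$acc loop independent
  do i = 2, nim
     !$acc loop independent private(ij)
     do j = 2, njm
        ij = (i-1)*nj + ijst + j
        p(ij)  = r(ij)
        r0(ij) = r(ij)
     end do
  end do
  !$acc end kernels

  print *, 'interior cells copied:', count(p /= 0.0)
end program rectangular_copy
```

Without `-acc` the directives are treated as comments, so the same source can be checked on the host first. Compiling with `-acc -Minfo=accel` shows whether the compiler accepted both loops.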

Hope this helps,
Mat
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Thu Nov 08, 2012 6:05 pm    Post subject: Thank you, but...

Hi, Mat

Thank you very much. That really helps me.

However, for the section below

    !$acc loop independent
    DO I=2,NIM
      II=(I-1)*NJ+IJGR(L)
      DO IJ=II+2,II+NJM
        p(IJ) = r(IJ)
        r0(IJ) = r(IJ)
      END DO
    END DO

the compiler still tells me

    Loop carried reuse of 'p' prevents parallelization
    Loop carried reuse of 'r0' prevents parallelization

Any idea how to solve it?
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Fri Nov 09, 2012 8:03 am    Post subject: a little improvement

I have found something interesting.
The code I asked about before is

    DO I=2,NIM
      II=(I-1)*NJ+IJGR(L)
      DO IJ=II+2,II+NJM
        p(IJ) = r(IJ)
        r0(IJ) = r(IJ)
      END DO
    END DO

If I collapse the two DO loops into one, the compiler no longer complains and the code works correctly, with no need to add "loop independent".
To be clear, I just changed it to

    DO IJ=NJ+IJGR(L)+2, (NIM-1)*NJ+IJGR(L)+NJM
      p(IJ) = r(IJ)
      r0(IJ) = r(IJ)
    END DO

So I admit it was my fault for writing it that way; the compiler is not a god who knows everything.
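
One point to check before relying on the single-loop form: it runs over a contiguous index range, so it also touches the entries between rows that the nested form skips. The sketch below compares the two index sets on the host. The sizes are hypothetical, `ijst` stands in for IJGR(L), and NIM = NI-1, NJM = NJ-1 is an assumption about the real code.

```fortran
program compare_index_sets
  implicit none
  ! Hypothetical sizes; NIM = NI-1 and NJM = NJ-1 are assumptions.
  integer, parameter :: ni = 6, nj = 5
  integer, parameter :: nim = ni - 1, njm = nj - 1
  integer, parameter :: ijst = 0          ! stands in for IJGR(L)
  logical :: nested(ni*nj), flat(ni*nj)
  integer :: i, ii, ij

  nested = .false.
  flat   = .false.

  ! Original two-level form: marks only the interior of each row.
  do i = 2, nim
     ii = (i-1)*nj + ijst
     do ij = ii+2, ii+njm
        nested(ij) = .true.
     end do
  end do

  ! Single-loop form from the post: marks one contiguous range.
  do ij = nj + ijst + 2, (nim-1)*nj + ijst + njm
     flat(ij) = .true.
  end do

  print *, 'touched by nested loops:', count(nested)
  print *, 'touched by single loop :', count(flat)
  do ij = 1, ni*nj
     if (flat(ij) .and. .not. nested(ij)) print *, '  extra IJ =', ij
  end do
end program compare_index_sets
```

With these made-up sizes the single loop touches 18 entries against 12 for the nested loops; the six extra ones are the row-boundary entries. For a plain copy such as p = r this may be harmless, but whether it is depends on how the boundary entries are used elsewhere in the code.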

That is not the end of the interesting part. In the rest of my code, statements of the form

    A(IJ) = A(IJ) + somethingelse

or

    A(IJ) = B(IJ) + somethingelse

draw the same complaint, while the whole-array forms below compile without any problem:

    A = A + somethingelse

or

    A = B + somethingelse

I find that curious, but I don't know why.

I would like to change the topic now.
I am a little puzzled about how large a region the !$acc data directive should cover. Is wider always better? In other words, should we keep data on the GPU, even something as small as a scalar constant (not an array), as long as it is used frequently?
Any advice would be appreciated.
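
For reference, here is a minimal sketch of a data region that spans several kernels, so the arrays are transferred once rather than once per kernel. The array length, step count and the scalar `alpha` are hypothetical. The scalar is deliberately left out of the data clauses, on the assumption that the compiler passes scalars along with each kernel launch; the `-Minfo=accel` feedback should confirm what is actually copied.

```fortran
program data_region_sketch
  implicit none
  integer, parameter :: n = 1000          ! hypothetical array length
  integer, parameter :: nsteps = 10       ! hypothetical iteration count
  real :: r(n), p(n), r0(n)
  real :: alpha                           ! scalar used in every kernel
  integer :: ij, step

  r     = 1.0
  p     = 0.0
  r0    = 0.0
  alpha = 0.5

  ! Arrays move once for the whole region, not once per kernel.
  !$acc data copyin(r) copy(p) copyout(r0)
  do step = 1, nsteps
     !$acc kernels
     !$acc loop independent
     do ij = 1, n
        p(ij) = p(ij) + alpha*r(ij)
     end do
     !$acc end kernels
  end do

  !$acc kernels
  !$acc loop independent
  do ij = 1, n
     r0(ij) = r(ij)
  end do
  !$acc end kernels
  !$acc end data

  print *, 'p(1) =', p(1), ' r0(1) =', r0(1)
end program data_region_sketch
```

If host code between the kernels needs to read or modify one of these arrays, `!$acc update host(...)` and `!$acc update device(...)` can synchronize just that array inside the region instead of ending the region early.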
KevinWoo



Joined: 08 Aug 2012
Posts: 19

Posted: Mon Nov 12, 2012 8:58 am    Post subject: well

Although collapsing the two DO loops into one worked, I cannot rely on luck to solve the rest, because I ran into this loop nest:

    NIC=NIGR(L)
    NJC=NJGR(L)
    DO IC=2,NIC-1
      IIC=(IC-1)*NJC+IJGR(L)
      IF=2*IC-2
      IIF=(IF-1)*NJF+IJGR(L+1)
      DO JC=2,NJC-1
        IJC=IIC+JC
        JF=2*JC-2
        IJF=IIF+JF
        QA(IJC)=RES(IJF)+RES(IJF+1)+RES(IJF+NJF)+RES(IJF+NJF+1)
        FIA(IJC)=0.
      END DO
    END DO

I have tried many ways to get around the odd behaviour described above, but I still could not produce a version that the compiler does not complain about. Of course, if I just leave it as it is and add the directives, it complains as well. Any idea about that?
Also, I was disappointed by the run time. I wish the data would stay on the GPU; otherwise the transfers cost too much. It is not a good idea to mix CPU code with GPU code, is it? But what can you do when you don't know how to write ACC directives for all of it?
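
One possible rewrite of this nest computes both 1-D indices inside the inner loop, so no scalar is carried from the outer loop into the inner one. This is a sketch under assumptions, not a confirmed fix: the grid sizes are made up, `ijc0` and `ijf0` stand in for IJGR(L) and IJGR(L+1), and NJF = 2*NJC-2 is a guess at the fine-grid dimension.

```fortran
program restriction_sketch
  implicit none
  ! Hypothetical sizes for one coarse grid and one fine grid.
  integer, parameter :: nic = 6, njc = 6
  integer, parameter :: nif = 2*nic - 2, njf = 2*njc - 2   ! assumed relation
  integer, parameter :: ijc0 = 0, ijf0 = 0   ! stand in for IJGR(L), IJGR(L+1)
  real :: res(nif*njf), qa(nic*njc), fia(nic*njc)
  integer :: ic, jc, ijc, ijf

  res = 1.0
  qa  = 0.0
  fia = -1.0

  !$acc kernels
  !$acc loop independent
  do ic = 2, nic-1
     !$acc loop independent private(ijc, ijf)
     do jc = 2, njc-1
        ! Same arithmetic as IIC+JC and IIF+JF, folded into one expression each.
        ijc = (ic-1)*njc + ijc0 + jc
        ijf = (2*ic-3)*njf + ijf0 + 2*jc - 2
        qa(ijc)  = res(ijf) + res(ijf+1) + res(ijf+njf) + res(ijf+njf+1)
        fia(ijc) = 0.0
     end do
  end do
  !$acc end kernels

  print *, 'qa at first interior coarse cell:', qa(njc + 2)
end program restriction_sketch
```

Each (ic, jc) pair writes a distinct ijc, which is what makes the "independent" assertion safe here. Only the scalar indices are privatized; QA, FIA and RES stay shared so their values survive the loop.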
mkcolg



Joined: 30 Jun 2004
Posts: 6128
Location: The Portland Group Inc.

Posted: Wed Nov 14, 2012 8:43 am

Hi Kevin,

I'm attending SC12 right now and this question deserves more than the few minutes I have at the moment. Once I'm back in the office, I'll take a closer look here and see what we can do.

Please send me the updated version of your code since this will help me understand where you're at.

- Mat
All times are GMT - 7 Hours