PGI User Forum

N-body problem nested loop with OpenAcc
paokara



Joined: 06 Feb 2011
Posts: 24

Posted: Mon Mar 25, 2013 11:58 am

Hi Mat, and thank you.

I did a couple of experiments last week.

1) First, we solved the NaN problem by moving the IF statement outside our parallel region.

Code:

if(flag) then
!$acc parallel
...
!$acc end parallel
endif



Why was this happening? Is it because of my "FLAG"? Does every thread have its own copy of the FLAG variable?

2) I have two adjacent loops inside this parallel region. The first loop uses the reduction clause, and I need its results in the second loop.

Code:

msys = 0
tmpx = 0
tmpy = 0
tmpz = 0

!$acc loop vector reduction(+:msys,tmpx,tmpy,tmpz)
do i=1,N
   msys = msys + m(i)
   tmpx = tempx +vx(i)
   ...
enddo

!$acc loop gang vector
do i=1,N
   vxb(i) = vx(i) + tmpx/msys
   vyb(i) = vy(i) + tmpy/msys
   ...
enddo



First of all, why do I need the VECTOR clause in the first loop? (I get wrong results with GANG VECTOR.) Is it because of the reduction?
Second, is it possible to get different results in two different executions of my program? In your article you say that there is a barrier at the end of the parallel region, not at the end of the first loop. So I believe that some threads may not have the correct values for the calculations in the second loop (for example, an incorrect value of the msys variable).

3) When I change my parallel region into a kernels region I have another problem. In the NVIDIA Visual Profiler I can see communication between host and device when my program reaches that region, and I can't figure out the reason (is it because of the reduction?). There is a copy from host to device and then, after the loop, a copy back to the host. With the parallel construct I don't see that communication and I get better timings. Why is that happening?

Thank you for your help,
Sotiris
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Tue Mar 26, 2013 1:58 pm

Quote:
Why was this happening? Is it because of my "FLAG"? Does every thread have its own copy of the FLAG variable?
I would need a reproducing example to tell why. Each thread would get its own copy of flag1, but they should all be initialized to the same value. Most likely the cause is something else, unrelated to the if statement itself, but I can't tell what that is from what you have posted.
Quote:
First of all, why do I need the VECTOR clause in the first loop? (I get wrong results with GANG VECTOR.) Is it because of the reduction?

Is this a typo in your post or a typo in your program?
Quote:
tmpx = tempx +vx(i)

If this is directly from your program, then this could be the source of your issues. "tempx" may not be initialized. "tmpx" would need a last value, which makes the loop non-parallelizable, and that is probably why you need the "vector" clause to force parallelization.
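
For reference, a minimal sketch of the corrected pattern, assuming the "tempx" in the post is a typo for "tmpx". The program name, the size N, and the test data are hypothetical and not from the original code. Each loop is placed in its own parallel region so that the reduction has finished before the second loop reads msys and tmpx. That is one way to get the ordering, not necessarily the right structure for the full program. If OpenACC is not enabled at compile time, the directives are treated as comments and the program runs serially.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: N = 1000          ! hypothetical problem size
  real(dp) :: m(N), vx(N), vxb(N)
  real(dp) :: msys, tmpx
  integer :: i

  ! Hypothetical test data: unit masses, constant velocity.
  m  = 1.0_dp
  vx = 2.0_dp

  msys = 0.0_dp
  tmpx = 0.0_dp

  ! First region: the reduction. The region ends before the sums are used.
  !$acc parallel loop reduction(+:msys,tmpx)
  do i = 1, N
     msys = msys + m(i)
     tmpx = tmpx + vx(i)      ! "tmpx" on both sides, not "tempx"
  enddo

  ! Second region: every iteration reads the finished sums.
  !$acc parallel loop
  do i = 1, N
     vxb(i) = vx(i) + tmpx/msys
  enddo

  print *, msys, tmpx, vxb(1)
end program reduction_sketch
```

With the typo fixed, the reduction should no longer need a "vector"-only schedule to give consistent results, though that would have to be checked against the real code.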

Quote:
3) When I change my parallel region into a kernels region I have another problem. In the NVIDIA Visual Profiler I can see communication between host and device when my program reaches that region, and I can't figure out the reason (is it because of the reduction?). There is a copy from host to device and then, after the loop, a copy back to the host. With the parallel construct I don't see that communication and I get better timings. Why is that happening?
It could be the result of the reduction, since it needs to be passed between the kernels.
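
One way to narrow this down is to wrap the kernels region in a data region, so that the arrays stay on the device, and then compare the profile again. The sketch below assumes the remaining traffic is the reduction scalars being handed from the first generated kernel to the second. The program name, N, and the test data are hypothetical, and whether this removes the transfers seen in the profiler is not confirmed here.

```fortran
program kernels_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: N = 1000          ! hypothetical problem size
  real(dp) :: m(N), vx(N), vxb(N)
  real(dp) :: msys, tmpx
  integer :: i

  ! Hypothetical test data.
  m  = 1.0_dp
  vx = 2.0_dp

  msys = 0.0_dp
  tmpx = 0.0_dp

  ! Keep the arrays resident on the device across both loops.
  !$acc data copyin(m, vx) copyout(vxb)
  !$acc kernels
  ! Each loop in a kernels region may become a separate kernel.
  !$acc loop reduction(+:msys,tmpx)
  do i = 1, N
     msys = msys + m(i)
     tmpx = tmpx + vx(i)
  enddo

  !$acc loop independent
  do i = 1, N
     vxb(i) = vx(i) + tmpx/msys
  enddo
  !$acc end kernels
  !$acc end data

  print *, msys, tmpx, vxb(1)
end program kernels_sketch
```

If the profiler still shows small copies around the first loop after this change, they are likely the scalars msys and tmpx rather than the arrays.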

- Mat
All times are GMT - 7 Hours