PGI User Forum


Triply nested loop using implicit OpenACC

 
_sayan_



Joined: 07 Apr 2012
Posts: 29

Posted: Wed Sep 05, 2012 10:34 am    Post subject: Triply nested loop using implicit OpenACC

Greetings,

I would like to know how the PGI compiler handles nested loops in an implicit model. My code:
Code:

!$ACC KERNELS &
!$ACC PRESENT(p0,p1)
!$ACC LOOP INDEPENDENT
do k=k0,k1
  !$ACC LOOP INDEPENDENT
  do j=j0,j1
    !$ACC LOOP INDEPENDENT
    do i=i0,i1


I assume the outer two loops would be distributed across the Y/X blocks. Would the innermost loop be vectorized in this case?

Thank you,
Sayan
mkcolg



Joined: 30 Jun 2004
Posts: 6207
Location: The Portland Group Inc.

Posted: Wed Sep 05, 2012 10:57 am    Post subject:

Hi Sayan,

The most likely schedule is a 2-D block (gang) using strip-mined k and j loops, and a 3-D thread block (vector) from the k, j, and i loops. This is highly dependent on what the body of the loop looks like and how the data is accessed, though.

Hope this helps,
Mat
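
As a rough illustration of the strip mining mentioned above, the serial sketch below splits the k and j loops into strips and checks that every iteration is still visited exactly once. The loop extents and the strip sizes `tj` and `tk` are made-up values, not taken from the original code. The comments only indicate which loops would be candidates for gangs and vector threads, and the schedule the compiler actually picks may differ.

```fortran
program strip_mine_sketch
  implicit none
  ! All extents and strip sizes are hypothetical, chosen only for illustration.
  integer, parameter :: i0 = 1, i1 = 32, j0 = 1, j1 = 10, k0 = 1, k1 = 6
  integer, parameter :: tj = 4, tk = 2      ! strip sizes for the j and k loops
  integer :: visits(i0:i1, j0:j1, k0:k1)
  integer :: i, j, k, jb, kb

  visits = 0

  ! Outer strip loops: candidates for the 2-D gang (block) dimensions.
  do kb = k0, k1, tk
    do jb = j0, j1, tj
      ! Inner loops: candidates for the 3-D vector (thread block) dimensions.
      ! min() handles a final strip that is shorter than the strip size.
      do k = kb, min(kb + tk - 1, k1)
        do j = jb, min(jb + tj - 1, j1)
          do i = i0, i1
            visits(i, j, k) = visits(i, j, k) + 1
          end do
        end do
      end do
    end do
  end do

  ! Strip mining must cover the original iteration space exactly once.
  if (any(visits /= 1)) error stop "iteration space not covered exactly once"
  print *, "all", size(visits), "iterations visited once"
end program strip_mine_sketch
```

The sketch runs with any Fortran compiler because it contains no directives; it only shows how the index space is partitioned.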
_sayan_



Joined: 07 Apr 2012
Posts: 29

Posted: Wed Sep 05, 2012 12:09 pm    Post subject:

Thanks, Mat. Can you comment in general on whether this is a good way to use OpenACC in terms of performance? We actually observe different performance when we build this code block with different compilers. So I wanted to ask if I should explicitly use gangs and vector clauses in order to tune my code.
mkcolg



Joined: 30 Jun 2004
Posts: 6207
Location: The Portland Group Inc.

Posted: Wed Sep 05, 2012 12:58 pm    Post subject:

Quote:
So I wanted to ask if I should explicitly use gangs and vector clauses in order to tune my code.
Personally, I don't find that explicit schedule tuning helps much. The PGI compiler finds a good schedule in the vast majority of cases, and I'd rather not tie my program to a particular schedule, since it may not be optimal for other devices.

However, since you're tuning for the compiler, not the device, it may be worth it to you to set the schedule yourself. Granted, there's more to performance than the schedule, so fixing the schedule may still yield varying performance. Worth a try, though.

- Mat
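
For anyone who wants to try setting the schedule by hand, here is a minimal sketch of the same triply nested loop with explicit `gang` and `vector` clauses. The array names and bound names follow the original post. The array shapes, the loop body, the enclosing `data` region, the `collapse(2)` choice, and the vector length of 128 are all assumptions made so the example is self-contained. They are not recommended settings.

```fortran
program explicit_schedule
  implicit none
  ! Bounds and array shapes are hypothetical; the original post does not give them.
  integer, parameter :: i0 = 1, i1 = 64, j0 = 1, j1 = 16, k0 = 1, k1 = 8
  real :: p0(i0:i1, j0:j1, k0:k1), p1(i0:i1, j0:j1, k0:k1)
  integer :: i, j, k

  p0 = 1.0
  p1 = 0.0

  ! The data region makes p0 and p1 present on the device for the kernels region.
  !$acc data copyin(p0) copyout(p1)
  !$acc kernels present(p0, p1)
  ! Assumed choice: collapse k and j into one loop distributed across gangs.
  !$acc loop independent gang collapse(2)
  do k = k0, k1
    do j = j0, j1
      ! Assumed choice: run the stride-1 i loop in vector mode, length 128.
      !$acc loop independent vector(128)
      do i = i0, i1
        p1(i, j, k) = 2.0 * p0(i, j, k)   ! placeholder loop body
      end do
    end do
  end do
  !$acc end kernels
  !$acc end data

  ! Self-check: every element of p1 should be exactly 2.0.
  if (any(p1 /= 2.0)) error stop "unexpected result"
  print *, "sum(p1) =", sum(p1)
end program explicit_schedule
```

Without OpenACC enabled the directives are treated as comments and the program runs serially, which makes it easy to compare results between compilers before comparing timings.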
All times are GMT - 7 Hours