PGI User Forum

unrolling or data dependent loops
deeppow



Joined: 02 Feb 2012
Posts: 51

Posted: Wed Mar 06, 2013 1:51 pm

Mat,

"-Munroll=n:4" gave me a ~15% speedup on my test problem, from ~10 min to ~8.5 min. There was more than just the one case I noted.

This situation arises from my use of indirect addressing (what you call table lookup) for array indices across a large number of cases (mesh/grid cells).


-ralph
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Fri Mar 08, 2013 4:16 pm

Any benefit if you increase the unroll factor even further? -Munroll=n:8 or even -Munroll=n:64?

- Mat
deeppow




Posted: Sun Mar 10, 2013 5:04 pm

No improvement. I tried 8. I did look at the Task Manager CPU load with 4, and it was at or near 100% most of the time, so I figured it wasn't going to help much.

I understand what the unrolling does to the do-loop, but I'm not quite sure how it uses the CPU architecture. Would assume the unrolled loops are shipped off to different cores. How is this different than parallelization regarding a CPU?
mkcolg




Posted: Mon Mar 11, 2013 9:09 am

Quote:
No improvement. I tried 8.

Too bad, but worth a try.

Quote:
Would assume the unrolled loops are shipped off to different cores. How is this different than parallelization regarding a cpu?

No, unrolling does not auto-parallelize; the code is still executed sequentially on a single core.
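
As a rough illustration of the difference, here is a minimal sketch in C (not the Fortran code under discussion; the function and array names are made up for the example). It shows a loop with indirect addressing and the same loop unrolled by hand four times, which is approximately the transformation an unroll factor of 4 requests. Both versions run on one core in program order. The unrolled one simply performs the loop test and index update once per four elements and gives the compiler and the CPU's instruction scheduler four independent lookups to overlap.

```c
#include <stddef.h>

/* Rolled loop: one table lookup (val[idx[i]]) per iteration,
   plus a compare, increment, and branch for every element. */
void gather_scale_rolled(double *out, const double *val,
                         const int *idx, size_t n, double k)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = k * val[idx[i]];
}

/* Same computation unrolled by 4. Still sequential, single-threaded code:
   the loop overhead is paid once per four elements, and the four
   statements in the body are independent of each other. */
void gather_scale_unrolled4(double *out, const double *val,
                            const int *idx, size_t n, double k)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i]     = k * val[idx[i]];
        out[i + 1] = k * val[idx[i + 1]];
        out[i + 2] = k * val[idx[i + 2]];
        out[i + 3] = k * val[idx[i + 3]];
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i)
        out[i] = k * val[idx[i]];
}
```

Both functions write identical results for any n. Whether the unrolled form is actually faster depends on the compiler and the hardware; with indirect addressing, memory latency often limits the gain, which would be consistent with seeing no further improvement at a factor of 8.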

- Mat
deeppow




Posted: Mon Mar 11, 2013 10:09 am

Mat,
I must show my ignorance here. I would interpret sequential to mean on only 1 core, i.e. just like an old single-core scalar processor. If that is true, of what advantage is unrolling? It would seem to even add a little loop overhead. Since that makes no sense, I conclude I have something wrong in my thinking.

-ralph