PGI User Forum


compiler generating a "scenic route"

 
alanl



Joined: 19 Apr 2012
Posts: 7

Posted: Tue May 08, 2012 3:37 pm    Post subject: compiler generating a "scenic route"

I have been doing some basic testing with the 12.4 pgfortran compiler on a Windows 7 64-bit box (compile options = -Mextend -Mr8 -O2 -fast), including some comparisons with gfortran. For a matrix multiply comparison, PGI was a bit over 5 times faster, a decent result. However, for another case, the result was vastly worse.

This code is down to its minimalist configuration. The kindest description would be to call the result "pathetic", as it was 15 times slower than gfortran. And if you wrap it in pgcollect, it takes another 30 times longer to run, being 460 times slower than gfortran. I have no idea what the compiler is doing, but it is certainly taking the long road to get there.

I did some testing and have determined that the main source of the problem is the reshape function. PGI appears to get totally lost in there somewhere, though when it finally comes out, the answer is correct. A minor factor is the use of temporary files, which appears to add about a factor of 2 to the run time. (Note that this test has tiny arrays and a good chunk of the time is setup time.)

So until and unless this slow performance is resolved, there are 3 things to remember. (1) do NOT use the reshape function. (2) try to avoid temporary arrays. (3) if you ignore item 1, do not ever even think about using pgcollect.

Do you have any thoughts as to why this poor performance might be occurring? I know how to avoid these particular items, but one always wonders if the cause of this issue is lurking elsewhere as well.

Thanks.

-alan
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue May 08, 2012 4:42 pm

Hi Alan,

This is a known performance issue that we've been working on correcting for some time (TPR#18097). As you note, the issue is that we will always create a temp array to store the results of RESHAPE. This is correct behavior; however, the performance can be quite poor, especially if the arrays are small and RESHAPE is called many times.

In 12.5, we will add a new optimization which will eliminate the need for a temp array when only the source and shape arguments are present and the source is contiguous. The caveat is that this optimization does not cover all cases, such as when order or pad are used, so it may or may not help the performance of your code. If you can, please send us a reproducing example (trs@pgroup.com) so we can either confirm that this fixes your issue or add a new problem report.
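To make the distinction above concrete, here is a minimal sketch of the four kinds of RESHAPE call it describes. The program and variable names are made up for illustration, and the comments about which calls the 12.5 optimization covers are only a reading of the description above, not something verified against the compiler.

```fortran
program reshape_forms
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: src(4), big(8), a(2,2), b(2,2), c(2,3), d(2,2)
  integer :: i

  src = (/ 1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp /)
  big = (/ (real(i, dp), i = 1, 8) /)

  ! SOURCE and SHAPE only, contiguous source: the form described above
  ! as no longer needing a temp array in 12.5.
  a = reshape(src, (/ 2, 2 /))

  ! ORDER present: described above as not covered by the optimization.
  b = reshape(src, (/ 2, 2 /), order=(/ 2, 1 /))

  ! PAD present: also described as not covered.
  c = reshape(src, (/ 2, 3 /), pad=(/ 0.0_dp /))

  ! Strided (non-contiguous) source: assumed to fall outside the
  ! optimization, since the description requires a contiguous source.
  d = reshape(big(1:7:2), (/ 2, 2 /))

  ! Sanity checks on the values (standard RESHAPE semantics).
  if (any(b /= transpose(a))) stop 1                 ! ORDER=(2,1) fills row by row
  if (any(c(:,3) /= 0.0_dp)) stop 2                  ! last column comes from PAD
  if (any(d(:,1) /= (/ 1.0_dp, 3.0_dp /))) stop 3    ! every other element of big

  print *, 'all RESHAPE forms gave the expected values'
end program reshape_forms
```

All four calls produce correct results on any conforming compiler; the only question raised in this thread is how much time the temporary costs.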

As for pgcollect, I'm not sure what's happening there. It's non-intrusive so shouldn't impact the overall performance of the code. Having a reproducing example would help here as well.

Best Regards,
Mat
alanl



Joined: 19 Apr 2012
Posts: 7

Posted: Wed May 09, 2012 7:39 am

Mat,

OK, it looks like my "simple" test case has evoked the absolute worst combination for performance under your compiler: reshaping arrays of 2x2 and 2x1 inside a very large outer iteration loop. Well, test cases are supposed to be stressful...

I will ship you a copy of the stripped-down code using self-contained data.
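The test case itself is not reproduced in this thread. As a rough illustration of the pattern described (2x2 and 2x1 reshapes inside a long loop), a hypothetical sketch might look like the following. The iteration count, variable names, and the arithmetic in the loop body are all assumptions chosen only to keep the loop from being trivially removed; the second loop shows the "avoid RESHAPE" workaround using plain section assignments.

```fortran
program reshape_stress
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: niter = 10000000   ! hypothetical iteration count
  real(dp) :: v4(4), v2(2), m(2,2), col(2,1)
  real(dp) :: acc_reshape, acc_direct
  real :: t0, t1, t2
  integer :: i

  v4 = (/ 1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp /)
  v2 = (/ 5.0_dp, 6.0_dp /)

  ! Variant 1: tiny RESHAPE calls on every iteration.
  acc_reshape = 0.0_dp
  call cpu_time(t0)
  do i = 1, niter
     v4(1) = real(i, dp)
     m   = reshape(v4, (/ 2, 2 /))
     col = reshape(v2, (/ 2, 1 /))
     acc_reshape = acc_reshape + m(1,1)*col(1,1) + m(2,2)*col(2,1)
  end do
  call cpu_time(t1)

  ! Variant 2: same data movement with array-section assignments,
  ! so no RESHAPE call is involved.
  acc_direct = 0.0_dp
  do i = 1, niter
     v4(1) = real(i, dp)
     m(:,1)   = v4(1:2)
     m(:,2)   = v4(3:4)
     col(:,1) = v2
     acc_direct = acc_direct + m(1,1)*col(1,1) + m(2,2)*col(2,1)
  end do
  call cpu_time(t2)

  ! Both variants sum the same exactly representable values.
  if (acc_reshape /= acc_direct) stop 1

  print *, 'reshape loop (s): ', t1 - t0
  print *, 'direct loop  (s): ', t2 - t1
end program reshape_stress
```

Timings from a sketch like this depend heavily on compiler version and flags, and an optimizer may simplify either loop, so treat any numbers it prints as indicative only.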

Thanks for the info.

-alan
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Thu May 10, 2012 8:47 am

Hi Alan,

Yes, your code exhibits this pathological case. I tested with our pre-release version of 12.5 and found that your test, when compiled with PGI 12.5, is now over 2x faster than with gfortran (0.92 seconds versus 2.5 seconds) and over 40x faster than with PGI 12.4 (42 seconds).

- Mat
All times are GMT - 7 Hours