PGI User Forum

High-performance Linpack benchmark
mkcolg



Joined: 30 Jun 2004
Posts: 6206
Location: The Portland Group Inc.

Posted: Wed Jun 08, 2005 11:15 am

Hi Peter,

Since I haven't specifically looked at HPL, I don't know which flags give the best performance. For 64-bit systems in general, the aggregate flag '-fastsse' gives very good performance. '-fastsse' is roughly equivalent to '-O2 -Munroll=c:2 -Mlre -Mnoframe -Mscalarsse -Mvect=sse -Mcache_align -Mflushz'. If you have the time to experiment, some other options to try are:

-fastsse -Minline=levels:2 <- adjust the levels as needed
-fastsse -Mipa=fast,inline,safe
-fastsse -Munroll=n:4 <- adjust the number of times to unroll as needed
-fastsse -O3
-fastsse -Mipa=fast,libinline,libopt <- may not work since the libraries were not compiled with IPA
-fastsse -Mipa=fast,safe

Also try mixing and matching the options. For example, if both '-O3' and IPA inlining help, try '-fastsse -O3 -Mipa=fast,inline'.
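
If it helps to keep track of the combinations, here is a minimal sketch in Python that just enumerates candidate flag strings to paste into the HPL makefile one at a time. The helper name `flag_sets` and the choice of which extras to combine are only assumptions for illustration; the flags themselves are the ones listed above, and the script does not run the compiler or the benchmark.

```python
from itertools import combinations

BASE = "-fastsse"

# Extras taken from the list above. The values (levels:2, n:4) are only
# starting points and should be adjusted as needed.
EXTRAS = [
    "-O3",
    "-Minline=levels:2",
    "-Munroll=n:4",
    "-Mipa=fast,inline",
]


def flag_sets(base, extras, max_extras=2):
    """Return the base flag alone, then the base plus every combination
    of up to max_extras of the extra flags (hypothetical helper)."""
    result = [base]
    for k in range(1, max_extras + 1):
        for combo in combinations(extras, k):
            result.append(" ".join((base,) + combo))
    return result


if __name__ == "__main__":
    # Print one candidate flag string per line.
    for opts in flag_sets(BASE, EXTRAS):
        print(opts)
```

Rebuild and rerun HPL with the same HPL.dat for each line, and only keep an extra flag if the Gflops figure actually improves.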

You can also experiment with -Mprefetch, but prefetching generally only helps memory-bound codes, which I don't believe this one is.

Let us know what you find out,
Mat
peterp



Joined: 16 Apr 2005
Posts: 11

Posted: Thu Jun 09, 2005 7:46 am

Hi there,

Here's a typical result on the linpack benchmark with
HPL_OPTS = -fastsse -Minline=saxpy,sscal -Minfo -lpthread
and
CCNOOPT = -O0 -Kieee
on 81 Opteron 2.4 GHz processors:

T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR01L3R2       30000    80     9     9              85.68          2.101e+02
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0094769 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0151869 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0030144 ...... PASSED
============================================================================
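
As a sanity check on the Gflops column, the figure can be reproduced from N and the run time. This is only a rough sketch: it assumes HPL's usual operation count of 2/3*N^3 + 3/2*N^2 for an N x N solve, and the function name `hpl_gflops` is made up for the example. The inputs are the N, Time, P and Q values from the result line above.

```python
def hpl_gflops(n, seconds):
    """Gflop/s for an N x N solve, assuming an operation count of
    2/3*N^3 + 3/2*N^2 (hypothetical helper)."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# Values from the result line above: N, Time, and the P x Q process grid.
N, TIME_S, P, Q = 30000, 85.68, 9, 9

total = hpl_gflops(N, TIME_S)
per_process = total / (P * Q)

print(f"total: {total:.1f} Gflops, per process: {per_process:.2f} Gflops")
# prints "total: 210.1 Gflops, per process: 2.59 Gflops"
```

The total agrees with the reported 2.101e+02, and dividing by the 81 processes gives roughly 2.6 Gflops each.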

Thanks for the help.

Best regards,
Peter
All times are GMT - 7 Hours