PGI User Forum


profiler inconsistency?

 
zvanaz



Joined: 22 Feb 2007
Posts: 2

Posted: Fri Mar 09, 2007 6:35 pm    Post subject: profiler inconsistency?

Hi all,

I'm trying to profile a few subroutines within a fairly large f90 code, so I compiled and linked with -Mprof=func and ran several tests. At the same time, I used etime and dtime inside the code to get time estimates for the same subroutines, and I found two types of inconsistency:

a) For a subroutine which only reads in data, dtime tells me it took 61 s (user+sys), while the profiler says 359 s for the same run and subroutine. The call to the subroutine is immediately preceded and followed by calls to dtime.

b) Subroutine A has a few simple statements and a small do while loop, then two nested do loops from which two subroutines (B and C) are called, one or the other at each pass.
The self times and costs of B and C, cumulative and per call, are constant within accuracy, but the parent subroutine A has different self times in different tests: 246 s with one print statement, and 509 s without it! Everything else is the same.
None of the mentioned subroutines makes system calls or outputs anything, with the exception of the one print statement.
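
A minimal sketch of the dtime bracketing described in a), assuming a compiler that provides dtime as a function (it is a vendor extension, not standard Fortran, and its interface may differ between compilers). The routine read_data and its workload are hypothetical placeholders, and the standard system_clock call is added here only to show wall-clock time next to CPU time:

```fortran
program time_read_sketch
  implicit none
  real :: dtime               ! nonstandard extension; interface depends on the compiler
  real :: tarray(2)           ! tarray(1) = user CPU seconds, tarray(2) = system CPU seconds
  real :: cpu_total
  integer :: c0, c1, rate

  call system_clock(c0, rate) ! wall-clock reference (standard Fortran)
  cpu_total = dtime(tarray)   ! result discarded: marks the start of the interval
  call read_data()            ! hypothetical stand-in for the routine being timed
  cpu_total = dtime(tarray)   ! CPU time used since the previous dtime call
  call system_clock(c1)

  print '(a,f10.3)', 'user CPU   (s): ', tarray(1)
  print '(a,f10.3)', 'system CPU (s): ', tarray(2)
  print '(a,f10.3)', 'user+sys   (s): ', cpu_total
  print '(a,f10.3)', 'wall clock (s): ', real(c1 - c0) / real(rate)

contains

  subroutine read_data()
    ! Placeholder workload; the real routine would read the input data.
    integer :: i
    real :: s
    s = 0.0
    do i = 1, 10000000
      s = s + sqrt(real(i))
    end do
    if (s < 0.0) print *, s   ! keeps the loop from being optimized away
  end subroutine read_data

end program time_read_sketch
```

Note that dtime reports CPU time only, so time a routine spends blocked waiting on I/O does not show up in the user+sys figure, while it does show up in the wall-clock figure.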

these are all the options I'm using:
-Mvect=cachesize:524288 -Munroll -Mnoframe -O2 -pc 64 -Mfree -Mprof=func

any idea?

thanks a lot!
Giovanni
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue Mar 13, 2007 4:07 pm    Post subject:

Hi Giovanni,

I passed this on to our Tools Group but they're not sure. Can you please send an example and a description on how to recreate the problem?

Thanks,
Mat


Last edited by mkcolg on Sat Mar 17, 2007 8:03 am; edited 1 time in total
zvanaz



Joined: 22 Feb 2007
Posts: 2

Posted: Thu Mar 15, 2007 6:21 pm    Post subject:

Hi Mat,

mkcolg wrote:
Hi Giovanni,

I passed this on to our Tools Group but they're not sure. Can you please send an example and a description on how to recreate the problem?


I looked into it, but sending you the whole thing, with the data and everything you need to run it, would be about 1 GB, and isolating that specific piece of code would require quite some work.

I was able to determine that part of the inconsistencies were due to hardware problems on the specific node I was using on a Red Hat Linux cluster. Once I switched nodes the results improved a lot, but problem b) remained in a slightly changed form, even after I got rid of the print statements and compiled with no optimization at all:

Subroutine A has two nested do loops; on each pass it executes either subroutine B or subroutine C (never both), depending on a dynamic condition and on a second condition read from a configuration file once per run. If the conditions are such that C is never executed, the self time of A is 0.5 s. If C is executed (~70% of the time), the self time of A is 44 s; note that C is at least 10 times faster than B. Also, what A does outside of B and C is always the same.
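
One way to cross-check the profiler's self time for A is to time B and C by hand and subtract. The sketch below does this with the standard cpu_time intrinsic. The names sub_a, sub_b, sub_c and allow_c, the loop bounds, the 70% selection rule, and the workloads are all hypothetical stand-ins for the real code:

```fortran
program self_time_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: t_total, t_b, t_c

  ! .true. stands in for the configuration-file switch that enables C.
  call sub_a(.true., t_total, t_b, t_c)

  print '(a,f10.3)', 'A total (s): ', t_total
  print '(a,f10.3)', 'B       (s): ', t_b
  print '(a,f10.3)', 'C       (s): ', t_c
  print '(a,f10.3)', 'A self  (s): ', t_total - t_b - t_c

contains

  subroutine sub_a(allow_c, t_total, t_b, t_c)
    logical, intent(in) :: allow_c
    real(dp), intent(out) :: t_total, t_b, t_c
    real(dp) :: t_start, t_end, t0, t1, acc
    integer :: i, j

    t_b = 0.0_dp
    t_c = 0.0_dp
    acc = 0.0_dp
    call cpu_time(t_start)
    do i = 1, 200
      do j = 1, 200
        call cpu_time(t0)
        ! Each pass runs exactly one of B or C; C is chosen on ~70% of passes.
        if (allow_c .and. mod(i + j, 10) < 7) then
          call sub_c(acc)
          call cpu_time(t1)
          t_c = t_c + (t1 - t0)
        else
          call sub_b(acc)
          call cpu_time(t1)
          t_b = t_b + (t1 - t0)
        end if
      end do
    end do
    call cpu_time(t_end)
    t_total = t_end - t_start
    if (acc < 0.0_dp) print *, acc   ! keeps the work from being optimized away
  end subroutine sub_a

  subroutine sub_b(acc)
    ! Hypothetical heavier workload.
    real(dp), intent(inout) :: acc
    integer :: k
    do k = 1, 10000
      acc = acc + sqrt(real(k, dp))
    end do
  end subroutine sub_b

  subroutine sub_c(acc)
    ! Hypothetical lighter workload, roughly 10 times less work than sub_b.
    real(dp), intent(inout) :: acc
    integer :: k
    do k = 1, 1000
      acc = acc + sqrt(real(k, dp))
    end do
  end subroutine sub_c

end program self_time_sketch
```

The timer calls inside the inner loop add overhead of their own, and that overhead lands in A's self time, so the result is a rough cross-check rather than an exact figure. Comparing the output for allow_c set to .true. and .false. shows whether hand timing reproduces the jump the profiler reports.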

I'm not sure what's going on,
Thanks!
Giovanni
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Sat Mar 17, 2007 8:24 am    Post subject:

Hi Giovanni,

A couple of things to note. Adding print statements can affect which optimizations can be performed and might account for some of the difference. Try various optimization levels, "-O1", "-O2", "-fast", and "-fastsse", to see how your code is affected.

Also, "-Mprof=func" instruments your code, which is a fairly intrusive profiling method. You might consider using a hardware counter sampling method instead, as described in Chapter 2 of the PGI Tools Guide.

- Mat
All times are GMT - 7 Hours