|
| View previous topic :: View next topic |
| Author |
Message |
kolya
Joined: 25 May 2005 Posts: 2
|
Posted: Fri Sep 15, 2006 4:25 pm Post subject: gprof vs pgprof |
|
|
Hello All,
I've got a bit confused on a very trivial thing, probably somewhere about a kindergarten level. Maybe somebody would be so kind to help me out?
I'm trying to profile some code with pgprof first, and then with gprof, and the results seem to be quite different. Specifically, if I compile it with
-Mprof=func switch, the pgprof profile comes out to be
Function
Calls Time(%) Cost(%) Name:
---------------------------------------------------------------
66022803 38.12 68 greenf (gre_sep5.f:112)
276447237 24.97 25 ex (gre_sep5.f:326)
198068409 11.45 11 lint (gre_sep5.f:343)
15251 10.13 20 gre_full (gre_sep5.f:198)
39189414 3.40 3 ene (gre_sep5.f:316)
777969 2.64 31 measure_gim (sep15.f:1202)
777969 2.37 30 diag_dens (sep15.f:1030)
1309999 2.00 2 check (sep15.f:1739)
Then I compile the same code with -pg switch and run gprof:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
17.34 25.98 25.98 __mth_i_dexp
14.18 47.23 21.25 136384347 0.00 0.00 greenfuncz_greenf_
14.05 68.29 21.07 f90io_close
11.78 85.96 17.66 __rouexit
10.80 102.14 16.18 _mcount2
4.88 109.46 7.32 2709999 0.00 0.00 main_check
4.84 116.70 7.25 15251 0.00 0.00 greenfuncz_gre_full_
4.38 123.27 6.57 __mth_dsincos
2.48 126.99 3.72 487531869 0.00 0.00 greenfuncz_ex_
First of all, what are these (service?) functions mcount2, f90io_close and __rouexit? And do they take time (as gprof shows) or don't they (as pgprof shows)?
Also, does the function lint() take 10% of time (pgprof) or doesn't it (gprof?)
Thanks a lot in advance!
Zhenya |
|
| Back to top |
|
 |
mleair
Joined: 19 Jul 2004 Posts: 67 Location: Portland Group
|
Posted: Mon Sep 18, 2006 10:44 am Post subject: profiling |
|
|
Hi Zhenya,
To answer your question, let's first discuss the differences between -Mprof=func and -pg.
-Mprof=func generates an instrumented profile. We instrument the code. Essentially we wrap timers around each call site in your program. Compiling with -pg or -Mprof=time generates a sample based profile. Timers are not wrapped around each call site. Instead, a periodic snap shot of the running program's state is recorded. Sample based profiling, therefore, is less intrusive. Moreover, the time to profile your program is less with sample based than instrumentation. Also for programs that do not run for a long time, sample based profiling usually provides more accurate times since there's less profiling overhead. Instrumentation has the advantage of recording time for every call-site; allowing you to see the true call path of your program.
Yes, the _mcount2 function takes time. They comes from the gprof profiling mechanism. It's not a part of the -Mprof=func mechanism (so they are not a part of your -Mprof=func profile). Also -Mprof=func only records times in your user program. -pg/-Mprof=time may also catch some of the pgf90 runtime routines (e.g., f90io_close, etc.)
As for function lint(), it's possible that the 10% you're seeing in -Mprof=func is coming from use of some of the intrinsics/runtime routines recorded in the -pg profile. That's because -Mprof=func only records times by your user program. It does not record times spent by the runtime.
I hope this helps. Obviously no two profiling mechanisms are alike, but hopefully both will steer you in the right direction when finding performance bottle necks in your code. If in doubt, it's always good to try both as you did. When given a choice, I probably would use -Mprof=time (or -pg).
-Mark |
|
| Back to top |
|
 |
kolya
Joined: 25 May 2005 Posts: 2
|
Posted: Mon Sep 18, 2006 5:14 pm Post subject: |
|
|
Hello Mark,
Thank you very very much for such a detailed explanation! The whole thing makes sence now.
Only one tiny thing's left: is f90io_close routine related to printing to stdout (i.e. via print*,'blah') or only to the disk i/o?
A zillion thanks,
Zhenya |
|
| Back to top |
|
 |
mleair
Joined: 19 Jul 2004 Posts: 67 Location: Portland Group
|
Posted: Mon Sep 18, 2006 6:54 pm Post subject: |
|
|
| kolya wrote: | Hello Mark,
Thank you very very much for such a detailed explanation! The whole thing makes sence now.
Only one tiny thing's left: is f90io_close routine related to printing to stdout (i.e. via print*,'blah') or only to the disk i/o?
A zillion thanks,
Zhenya |
Hi Zhenya,
f90io_close is called usually to perform a Fortran CLOSE statement.
-Mark |
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2002 phpBB Group
|