PGI User Forum

confusion about Acc. Kernel Timing data ?

 
JMa



Joined: 30 Nov 2012
Posts: 22

Posted: Mon Jan 14, 2013 10:13 pm    Post subject: confusion about Acc. Kernel Timing data?

Hi Mat & All,
How should I interpret the Acc. kernel timing data? I thought:
total = init + kernels + data

But the following output from my run is confusing: the total is far larger than the sum of those three. Do you know what the "other" time beyond those three items is? Where does that time go?

Thanks,
Jingsen

Accelerator Kernel Timing data
...
89: region entered 13100 times
time(us): total=8,203,329 init=2,125 region=8,201,204
kernels=430,774 data=769,253
w/o init: total=8,201,204 max=10,256 min=244 avg=626
90: kernel launched 13100 times
grid: [8] block: [128]
time(us): total=324,067 max=68 min=23 avg=24
91: kernel launched 13100 times
grid: [1] block: [256]
time(us): total=106,707 max=68 min=8 avg=8
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Tue Jan 15, 2013 10:04 am    Post subject: confusion about Acc. Kernel Timing data?

Hi Jingsen,

The "total" time for an accelerator region is measured from the host while the other timers (init, kernels, data) are taken from the device driver. Hence, the delta between the two is the time spent on the host.
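To put a number on that delta, here is a rough sketch in Python using the figures from the profile output for the region at line 89. The variable names are only illustrative, and the "host-side" label assumes that everything not reported by the device timers really is host time.

```python
# Timings in microseconds, copied from the profile output for the region at line 89.
total = 8_203_329    # measured from the host
init = 2_125         # device-side timers
kernels = 430_774
data = 769_253

# Sanity check: the region's kernel time is the sum of its two kernels (lines 90 and 91).
assert kernels == 324_067 + 106_707

device_side = init + kernels + data
host_side = total - device_side   # the unaccounted "other" time

print(f"device-side (init + kernels + data): {device_side:,} us")
print(f"unaccounted host-side time:          {host_side:,} us")
print(f"host share of total:                 {host_side / total:.1%}")
```

On these numbers the three device-side timers account for roughly 1.2 seconds of the 8.2 second total, so most of the region's time falls into the host-side remainder.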

Exactly where the host time is being spent is something I'm currently investigating. A delta this large occurs only in some regions, not in others. From what I can tell, the host code gets blocked somewhere, either in our runtime libraries or in the CUDA device driver. The time spent blocked in each iteration isn't that large, but over a large number of iterations it adds up. I am looking into it and hopefully can identify the problem. After that, we can determine whether it's a performance bug, or at least explain the behaviour so it can be avoided. I'll post more once I know more, but so far I've been unable to determine the exact cause.
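As a rough illustration of how a small per-iteration cost adds up, the sketch below divides the profile figures for the region at line 89 by its 13,100 entries. It assumes the host-side time is spread evenly across entries, which the reported min and max (244 us and 10,256 us) suggest is only an approximation.

```python
# Figures in microseconds from the profile output for the region at line 89.
entries = 13_100
region = 8_201_204   # total without init
kernels = 430_774
data = 769_253

avg_region = region / entries              # matches the reported avg=626
avg_device = (kernels + data) / entries    # per-entry kernel + data time
avg_host = avg_region - avg_device         # per-entry time not seen by device timers

print(f"average per entry:        {avg_region:.0f} us")
print(f"  kernels + data:         {avg_device:.0f} us")
print(f"  unaccounted host time:  {avg_host:.0f} us")
print(f"host time over all entries: {avg_host * entries / 1e6:.1f} s")
```

About half a millisecond of unaccounted time per entry is small on its own, but across 13,100 entries it comes to about 7 seconds.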

- Mat

All times are GMT - 7 Hours