PGI User Forum
 SearchSearch   MemberlistMemberlist     RegisterRegister   ProfileProfile    Log inLog in 

CUDA-x86.

OpenACC runtime timings

 
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming
View previous topic :: View next topic  
Author Message
nickaj



Joined: 06 Sep 2011
Posts: 21

PostPosted: Thu Aug 16, 2012 2:39 am    Post subject: OpenACC runtime timings Reply with quote

Here's (a cutdown example illustrating) something I came across today that has me perplexed.

I was checking the timings of two kernels that should have done the same amount of work and thus had very similar timings when I noticed a discrepancy. Further investigation led me to the data copies and the following.

Code:

  acc_init(acc_device_nvidia);

  for (i=0;i<n;i++){
    A[i] = rand();
    B[i] = rand();
  }

  t_start = omp_get_wtime();
#pragma acc data copy(B[0:n])
  {
  }
  t_end = omp_get_wtime();


  t1 = t_end - t_start;


  t_start = omp_get_wtime();
#pragma acc data copy(A[0:n])
  {
  }
  t_end = omp_get_wtime();


  t2 = t_end-t_start;


A and B are 1D arrays of size n ints and n is large ( eg 536870912 ). I'm using the OpenMP timer routines as I need to report some timings in the code and I've found this method to produce results very similar to that which I get from using the run-time timers (PGI_ACC_TIME's output).

The card is a Tesla C2050 running on a backend system with a job launcher and in exclusive mode so I'm fairly sure I'm not getting interference from other jobs. I'm using PGI v12.6.

The output I get is:
Code:

  main
    52: region entered 1 time
        time(us): total=280,143
                  data=278,501
/home/gd11/gd11/njohnso1/iftime.c
  main
    42: region entered 1 time
        time(us): total=320,320
                  data=279,764
acc_init.c
  acc_init
    38: region entered 1 time
        time(us): init=5,136,287


The question is, why do I get an additional ~40ms expended in the loop at 42 cf the loop at 52? I can understand the difference in the data copy time as noise in the system but I'd expect not to find any noise in the region time and for it to be pretty close to the data copy time, give or take a fixed region time overhead. I should note that on occasion I do get the values to fall roughly in line but this is the exception rather than the rule.

Cheers,
-Nick.
Back to top
View user's profile
mkcolg



Joined: 30 Jun 2004
Posts: 6146
Location: The Portland Group Inc.

PostPosted: Thu Aug 16, 2012 2:34 pm    Post subject: Reply with quote

Hi Nick,

Total time is measured from the host side while data transfer time is measured from the device. Hence, the extra time is occurring on the host. Though, I can only speculate the exact reason for the increased time. Some possible reasons:

Device to host transfers require use of pined memory and thus a host to host copy needs to be performed (virtual memory to pinned physical memory). The OS may need to reap this memory before getting the second array.

It could be a caching issue where B is cached while A is coming from main memory.

The OS could be interrupting the code while the copy is occurring and not getting back to code for a bit longer.

One thing to try is rearranging the code a bit:
Code:

  acc_init(acc_device_nvidia);

  for (i=0;i<n;i++){
    B[i] = rand();
  }

  t_start = omp_get_wtime();
#pragma acc data copy(B[0:n])
  {
  }
  t_end = omp_get_wtime();


  t1 = t_end - t_start;


  for (i=0;i<n;i++){
    A[i] = rand();
  }

  t_start = omp_get_wtime();
#pragma acc data copy(A[0:n])
  {
  }
  t_end = omp_get_wtime();


  t2 = t_end-t_start;


If it is a memory caching issue, I would expect this code to equal out the CPU time.

- Mat
Back to top
View user's profile
Display posts from previous:   
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming All times are GMT - 7 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © phpBB Group