PGI User Forum

finding executed time using PGI_ACC_TIME

 
Tae Hyeok, Jang



Joined: 28 Jan 2014
Posts: 4

Posted: Sun Feb 09, 2014 11:27 pm    Post subject: finding executed time using PGI_ACC_TIME

Hi, I'm new to OpenACC.

To test the OpenACC compiler and see how much faster it makes a program,

I compiled with pgcc -acc -o test03.exe test03.c -ta=tesla:cc1x -Minfo=accel and ran the result.

But when I vary n from 1 to 10^6, the times reported by PGI_ACC_TIME are not what I expected.

I have heard that when n is small, there must be some overhead for sending data from the host to the device and back.

Approximately, the time is 41 for n = 1, 41 for n = 10, 41 for n = 100, 82 for n = 1000, about 400 for n = 10000, and about 4000 for n = 100000.

Is there a problem in my code? Or do you have any advice for me?

Thank you very much in advance for your kind reply :).



Code:
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main( int argc, char* argv[] )
{

    int n;      /* size of the vector */
    float *a;  /* the vector */
    float *restrict r;  /* the results */
    float *e;  /* expected results */
    int i;

    if( argc > 1 )
        n = atoi( argv[1] );
    else
        n = 1000;
    if( n <= 0 ) n = 1000;

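    a = (float*)malloc(n*sizeof(float));
    r = (float*)malloc(n*sizeof(float));
    e = (float*)malloc(n*sizeof(float));
    /* stop early if any allocation failed */
    if( a == NULL || r == NULL || e == NULL )
    {
        fprintf( stderr, "allocation failed\n" );
        return 1;
    }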
    /* initialize */
    for( i = 0; i < n; ++i ) a[i] = (float)(i+1);

    printf("start!!\n");

    /* compute on the accelerator */
#pragma acc kernels loop
    for( i = 0; i < n; ++i ) r[i] = a[i]*2.0f;
    /* compute on the host to compare */
    for( i = 0; i < n; ++i ) e[i] = a[i]*2.0f;
    /* check the results */
    for( i = 0; i < n; ++i )
    {
        assert( r[i] == e[i] );
    }

    printf("end!!\n");
    printf( "%d iterations completed\n", n );

    /* release the host buffers */
    free(a);
    free(r);
    free(e);
    return 0;
}
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Mon Feb 10, 2014 10:43 am

Hi Tae Hyeok, Jang,

Note that there's very little compute here, so it's not a great example for showing performance. There is some overhead, but the times you're seeing have more to do with GPU saturation. It appears to take approximately 41 ms per block, so once you saturate the GPU, the remaining blocks still need to run, hence the 82 ms. If you graph the times, you'd most likely see a stair step.

- Mat
All times are GMT - 7 Hours