PGI User Forum

How can I obtain the number of GPU cores currently in use?

 
addison827



Joined: 02 Nov 2010
Posts: 12

Posted: Mon Mar 14, 2011 11:36 pm

How can I obtain the number of GPU cores that are currently in use? Are there any instructions or an example?
mkcolg



Joined: 30 Jun 2004
Posts: 6479
Location: The Portland Group Inc.

Posted: Tue Mar 15, 2011 2:47 pm

Hi addison827,

While you can't tell the number of GPUs in use, you can tell the number of GPUs available via the runtime routine "acc_get_num_devices" when using the PGI Accelerator Model, or via "cudaGetDeviceCount" in CUDA Fortran.

Hope this helps,
Mat
addison827



Joined: 02 Nov 2010
Posts: 12

Posted: Tue Mar 15, 2011 8:09 pm

Thanks for your help. However, my question is about the number of processors (cores) currently in use, not the number of devices (GPUs). With shared-memory parallelism, we can set how many cores we want to use. Can I do the same thing while using a GPU?
mkcolg



Joined: 30 Jun 2004
Posts: 6479
Location: The Portland Group Inc.

Posted: Wed Mar 16, 2011 10:38 am

Hi addison827,

I think I understand what you're asking for, but it's the wrong question to be asking, since SMP parallel programming doesn't apply to a GPU. The question you should be asking is "how do I tell the occupancy of my program?" Occupancy is defined as "the ratio of active warps to the maximum number of warps supported on a multiprocessor of the GPU." In other words, it measures how well your program is keeping all the cores busy. Low occupancy usually leads to lower performance, but high occupancy does not necessarily lead to high performance. A web search for "occupancy CUDA" will provide more details.
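To make the definition concrete, here is a minimal sketch of the occupancy ratio as plain arithmetic. The numbers (4 resident blocks of 128 threads, and a multiprocessor that supports at most 48 warps) are assumptions chosen for illustration, not values reported by any tool; check the limits for your own GPU.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def occupancy(active_warps_per_sm, max_warps_per_sm):
    """Ratio of active warps to the maximum warps a multiprocessor supports."""
    return active_warps_per_sm / max_warps_per_sm


# Assumed example values (hypothetical, for illustration only).
blocks_per_sm = 4          # thread blocks resident on one multiprocessor
threads_per_block = 128    # block size chosen by the programmer
max_warps_per_sm = 48      # assumed hardware limit

warps_per_block = threads_per_block // WARP_SIZE   # 128 / 32 = 4
active_warps = blocks_per_sm * warps_per_block     # 4 * 4 = 16

print(f"occupancy = {occupancy(active_warps, max_warps_per_sm):.2f}")
```

With these assumed values only a third of the warp slots are filled, so there are few other warps to switch to when one stalls.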

In CUDA programming, the user does not control the number of cores being used; rather, you control the number of threads created. What you want is to have enough threads in a thread block so that when one warp (a warp is a group of 32 threads) stalls due to a memory fetch or other operation, another warp can be swapped in. So a block size of 64, 128, or even 512 threads is usually better than a very small block. However, your block size can be limited by register and shared memory usage: the more registers and shared memory used per thread, the fewer threads you can have. In addition to the block size, you want enough blocks to fully populate all the Streaming Multiprocessors.
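The trade-off between block size and per-thread resource usage can be sketched with a simplified model. This is an estimate under stated assumptions, not what the hardware or the CUDA Occupancy Calculator does exactly: real GPUs allocate registers and shared memory in coarser units. The `ASSUMED_LIMITS` values and all function names are hypothetical and chosen for illustration; substitute the limits for your device.

```python
import math

# Assumed per-multiprocessor limits (illustrative; look up your GPU's values).
ASSUMED_LIMITS = {
    "max_threads": 1536,   # resident threads per multiprocessor
    "max_warps": 48,       # resident warps per multiprocessor
    "max_blocks": 8,       # resident blocks per multiprocessor
    "registers": 32768,    # 32-bit registers per multiprocessor
    "shared_mem": 49152,   # bytes of shared memory per multiprocessor
}


def warps_per_block(threads_per_block, warp_size=32):
    """A partially filled warp still occupies a whole warp slot."""
    return math.ceil(threads_per_block / warp_size)


def resident_blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block, limits):
    """Simplified estimate: the tightest resource limit wins.

    Assumes regs_per_thread > 0; smem_per_block may be 0.
    """
    by_threads = limits["max_threads"] // threads_per_block
    by_warps = limits["max_warps"] // warps_per_block(threads_per_block)
    by_regs = limits["registers"] // (regs_per_thread * threads_per_block)
    if smem_per_block > 0:
        by_smem = limits["shared_mem"] // smem_per_block
    else:
        by_smem = limits["max_blocks"]  # shared memory is not a constraint
    return min(limits["max_blocks"], by_threads, by_warps, by_regs, by_smem)


def blocks_needed(n_elements, threads_per_block):
    """Grid size needed to give each element its own thread."""
    return math.ceil(n_elements / threads_per_block)


for regs in (16, 32):
    blocks = resident_blocks_per_sm(256, regs, 4096, ASSUMED_LIMITS)
    occ = blocks * warps_per_block(256) / ASSUMED_LIMITS["max_warps"]
    print(f"{regs} registers/thread: {blocks} blocks per SM, occupancy {occ:.2f}")

print("blocks for 1,000,000 elements:", blocks_needed(1_000_000, 256))
```

In this model, doubling the registers per thread from 16 to 32 makes the register file the limiting resource and reduces the number of resident blocks, which is the effect described above. The last line shows how many blocks are created for a given problem size, which you can compare against the number of multiprocessors times the resident blocks per multiprocessor.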

Michael Wolfe wrote a good concise article about the CUDA Data Parallel threading model which you might find helpful: http://www.pgroup.com/lit/articles/insider/v2n1a5.htm

So how do you tell the Occupancy? If you're using the PGI Accelerator Model, the compiler will list the occupancy in the informational output (-Minfo).

For CUDA Fortran, you can use the CUDA Occupancy Calculator (http://news.developer.nvidia.com/2007/03/cuda_occupancy_.html) or the CUDA Profiler (i.e., set CUDA_PROFILE=1 in your environment, run the program, and review the resulting cuda_profile_0.log file). Note that the number of registers used can be found using the "-Mcuda=ptxinfo" flag.

Hope this helps,
Mat
All times are GMT - 7 Hours