Minh
Joined: 01 Mar 2013 Posts: 9
|
Posted: Tue Apr 16, 2013 9:05 am Post subject: Occupancy with pgc++13.2 ? |
|
|
Hello,
I have several questions about pgc++ 13.2. Suggestions are very welcome!
1. I found that pgc++ 13.2 changes the resource configuration (e.g. the number of blocks and threads) of OpenACC-annotated code at runtime.
The compiler feedback shows that it accepted my requested configuration of grid(200,100) block(32,16,1):
| Code: | 26, #pragma acc loop gang(100), vector(16) /* blockIdx.y threadIdx.y */
28, #pragma acc loop gang(200), vector(32) /* blockIdx.x threadIdx.x */
|
But when I profiled the program with nvprof, the configuration had changed to grid(256,512) block(32,16,1). Does pgc++ 13.2 notify users of this change?
| Code: | | 9e+09s 0ns (256 512 1) (32 16 1) 39 0B 0B - - 0 1 2 _Z8t1_f_acciiiPPdS0_S0_S0_S0_S0__30_ gpu |
2. On what criteria does the compiler choose the resource configuration?
3. Which flags produce the following compiler feedback? Please also tell me how the occupancy is calculated at compile time.
| Quote: | | CC 2.0 : 26 registers; 8 shared, 92 constant, 0 local memory bytes; 33% occupancy |
Thank you very much,
Best regards,
Minh |
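For readers who want to reproduce the feedback above, here is a minimal sketch of a loop nest carrying the same gang/vector clauses. The kernel `add_2d` and its arguments are hypothetical and are not the poster's `t1_f_acc` code; only the clause values come from the post. With these clauses the outer loop maps to blockIdx.y/threadIdx.y and the inner loop to blockIdx.x/threadIdx.x, which is where the requested grid(200,100) block(32,16,1) comes from.

```cpp
// Hypothetical kernel (an assumption for illustration, not the poster's code):
// c = a + b on a rows x cols grid stored row-major in flat arrays.
void add_2d(int rows, int cols, const double* a, const double* b, double* c) {
    // OpenACC 1.0-style clauses. A compiler without OpenACC support ignores
    // the pragmas and simply runs the loops on the host.
    #pragma acc kernels copyin(a[0:rows * cols], b[0:rows * cols]) copyout(c[0:rows * cols])
    {
        // Outer loop: requests 100 gangs and a vector length of 16
        // (reported by the compiler as blockIdx.y / threadIdx.y).
        #pragma acc loop independent gang(100), vector(16)
        for (int i = 0; i < rows; ++i) {
            // Inner loop: requests 200 gangs and a vector length of 32
            // (reported as blockIdx.x / threadIdx.x).
            #pragma acc loop independent gang(200), vector(32)
            for (int j = 0; j < cols; ++j) {
                c[i * cols + j] = a[i * cols + j] + b[i * cols + j];
            }
        }
    }
}
```

The launch configuration actually used at runtime can then be compared against these clauses in the nvprof output, as in the post above.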
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Tue Apr 16, 2013 11:00 am Post subject: |
|
|
Hi Minh,
| Quote: | | 1. I found that pgc++ 13.2 changes the resource configuration (e.g. the number of blocks and threads) of OpenACC-annotated code at runtime. | This is a known issue, tracked as TPR#19149. We expect to have it fixed in the 13.5 release.
| Quote: | | 3. Which flags produce the following compiler feedback? Please also tell me how the occupancy is calculated at compile time. | We stopped printing the occupancy for various implementation reasons, though we could probably bring it back. Was it something you found useful?
You can still get the ptxinfo via the flag "-Mcuda=ptxinfo". With the register and shared memory info, you can then calculate the occupancy with the CUDA Occupancy Calculator.
- Mat |
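As a rough illustration of the calculation Mat refers to, the sketch below estimates theoretical occupancy from registers per thread, shared memory per block, and block size. The function and struct names are hypothetical, the per-SM limits are assumed values for compute capability 2.0, and the rounding rules are simplified, so the CUDA Occupancy Calculator remains the authoritative reference.

```cpp
#include <algorithm>

// Per-multiprocessor limits assumed for compute capability 2.0 (Fermi).
// These are assumptions of this sketch; verify them against the
// CUDA Occupancy Calculator before relying on the result.
struct SmLimits {
    int max_warps = 48;           // resident warps per SM
    int max_blocks = 8;           // resident blocks per SM
    int registers = 32768;        // 32-bit registers per SM
    int shared_bytes = 49152;     // shared memory per SM (48 KB configuration)
    int warp_size = 32;
    int reg_alloc_unit = 64;      // registers allocated per warp, in units of 64
    int shared_alloc_unit = 128;  // shared memory allocated in units of 128 bytes
};

struct Occupancy {
    int active_blocks;  // resident blocks per SM
    int active_warps;   // resident warps per SM
    double fraction;    // active_warps / max_warps
};

// Round value up to the next multiple of unit.
static int round_up(int value, int unit) { return (value + unit - 1) / unit * unit; }

// Simplified estimate; threads_per_block must be positive.
Occupancy estimate_occupancy(int regs_per_thread, int shared_per_block,
                             int threads_per_block, const SmLimits& sm = SmLimits()) {
    const int warps_per_block = (threads_per_block + sm.warp_size - 1) / sm.warp_size;

    // Limit 1: block slots and warp slots.
    int blocks = std::min(sm.max_blocks, sm.max_warps / warps_per_block);

    // Limit 2: register file.
    if (regs_per_thread > 0) {
        const int regs_per_warp = round_up(regs_per_thread * sm.warp_size, sm.reg_alloc_unit);
        const int warps_by_regs = sm.registers / regs_per_warp;
        blocks = std::min(blocks, warps_by_regs / warps_per_block);
    }

    // Limit 3: shared memory.
    if (shared_per_block > 0) {
        const int shared = round_up(shared_per_block, sm.shared_alloc_unit);
        blocks = std::min(blocks, sm.shared_bytes / shared);
    }

    const int warps = blocks * warps_per_block;
    return {blocks, warps, static_cast<double>(warps) / sm.max_warps};
}
```

Under these assumptions, 26 registers and 8 bytes of shared memory with a 32x16 (512-thread) block give 2 resident blocks, i.e. 32 of 48 warps, while a 64-thread block is capped at 8 resident blocks, i.e. 16 of 48 warps. The result depends strongly on the block size, which the quoted compiler message does not state.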
|
Minh
Joined: 01 Mar 2013 Posts: 9
|
Posted: Wed Apr 17, 2013 10:22 am Post subject: |
|
|
Hi Mat,
I think occupancy information is still useful when the user knows that the achieved memory bandwidth is below the peak bandwidth and wants to increase the occupancy.
I use pgc++ 13.2 and haven't found the -Mcuda flag in the pgc++ manual page. Is it a new flag in the newest version of pgc++?
Best regards,
Minh |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Wed Apr 17, 2013 11:26 am Post subject: |
|
|
| Quote: | | I use pgc++ 13.2 and haven't found the -Mcuda flag in the pgc++ manual page. Is it a new flag in the newest version of pgc++? | Sorry, I missed that you're using C++. -Mcuda is for CUDA Fortran and CUDA x86, not C++.
| Quote: | | I think occupancy information is still useful when the user knows that the achieved memory bandwidth is below the peak bandwidth and wants to increase the occupancy. | OK, I'll pass that along. It will be a lower-priority item, but it should be put back in at some point.
- Mat |
|