PGI User Forum
 SearchSearch   MemberlistMemberlist     RegisterRegister   ProfileProfile    Log inLog in 

Free OpenACC Webinar

Volatile in Kernels

 
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming
View previous topic :: View next topic  
Author Message
Denis Altmann



Joined: 25 Nov 2004
Posts: 3

PostPosted: Fri Jul 15, 2011 1:01 am    Post subject: Volatile in Kernels Reply with quote

I tried to create a device subroutine that synchronizes multiple cuda blocks. For this to work it is necessary to declare a variable volatile:

---
attributes(device) subroutine syncblocks(syncval)

implicit none

integer,volatile :: syncval
integer :: dummy

call threadfence()
call syncthreads()
if (threadIdx%x==1) then
dummy=atomicAdd(syncval,1)
do
if (syncval == gridDim%x) then
exit
end if
end do
end if
call syncthreads()

end subroutine syncblocks
---

If I write the equivalent in CUDA C it works quite fine:

---
__device__ void __syncblocks(int* syncval) {
__threadfence();
__syncthreads();
if (threadIdx.x==0) {
atomicAdd(syncval,1);
while(*(volatile int*)syncval<gridDim.x) {
}
}
__syncthreads();
}
---

Unfortunately the Fortran version does not work. This is because the volatile keyword is simply dropped during the translation to CUDA C. The above kernel results in:

---
extern "C" __device__ void syncblocks(signed char* _psyncval)
{
int dummy;
int xthreadidx_x;
int xgriddim_x;
xthreadidx_x = (int)(threadIdx.x+1);
xgriddim_x = (int)gridDim.x;
__threadfence();
__syncthreads();
if( ((xthreadidx_x)!=(1))) goto _BB_6;
dummy = __pgi_atomicAddi((signed char*)(_psyncval), 1);
_BB_5: ;
if( (((*(int*)(_psyncval)))!=(xgriddim_x))) goto _BB_5;
_BB_6: ;
__syncthreads();
}
---

If volatile is not supported for device code in CUDA Fortran, I think the compiler should give an error or at least a warning.
Back to top
View user's profile
mkcolg



Joined: 30 Jun 2004
Posts: 6213
Location: The Portland Group Inc.

PostPosted: Mon Jul 18, 2011 1:36 pm    Post subject: Reply with quote

Hi Denis,

I talked with our engineers and they will get this fixed in August's 11.8 release. Volatile should get passed to the generated CUDA C code.

Best Regards,
Mat
Back to top
View user's profile
Display posts from previous:   
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming All times are GMT - 7 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © phpBB Group