PGI User Forum


Using atomic memory functions
crip_crop



Joined: 28 Jul 2010
Posts: 68

Posted: Thu Jul 05, 2012 5:34 am    Post subject: Using atomic memory functions

Hi there,

I basically need to use atomic memory functions to prevent a race condition in my device code, but I haven't been able to because NVIDIA only supports these functions in single precision.

However, I've recently become aware that there is a way of converting double-precision data so that it is stored in two integer/single-precision memory slots, which would solve that issue. The problem now is that I don't have a clue how to go about converting one double into two 32-bit values.

A while back I was porting some C code to the GPU, and I remember that to use texture memory I had to store the double-precision data as pairs of 32-bit values: I declared the texture with the int2 type and then used __hiloint2double when fetching from the texture.

Does anyone have any idea how to do this when not using texture memory? Or has anyone ever done anything similar to what I'm trying to achieve?
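For reference, the same hi/lo split can be done on plain global memory, with no texture involved. The sketch below is only an illustration in CUDA C: the kernel name `split_and_rebuild` and the host helper `round_trip_ok` are hypothetical, while `__double2hiint`, `__double2loint` and `__hiloint2double` are the standard device intrinsics. It splits each double into two 32-bit words, stores them in an `int2` (x = low word, y = high word, matching the order used with `__hiloint2double` in the texture case), and rebuilds the double.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Split each double into its two 32-bit halves, then rebuild it from them.
__global__ void split_and_rebuild(const double *in, int2 *parts, double *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int hi = __double2hiint(in[i]);  // upper 32 bits of the 64-bit pattern
        int lo = __double2loint(in[i]);  // lower 32 bits of the 64-bit pattern
        parts[i] = make_int2(lo, hi);    // x = low word, y = high word
        out[i] = __hiloint2double(parts[i].y, parts[i].x);
    }
}

// Hypothetical host helper: true if every input value survives the round trip.
bool round_trip_ok(const double *host_in, int n)
{
    double *d_in = nullptr, *d_out = nullptr;
    int2 *d_parts = nullptr;
    std::vector<double> host_out(n);

    bool ok = cudaMalloc(&d_in, n * sizeof(double)) == cudaSuccess
           && cudaMalloc(&d_out, n * sizeof(double)) == cudaSuccess
           && cudaMalloc(&d_parts, n * sizeof(int2)) == cudaSuccess
           && cudaMemcpy(d_in, host_in, n * sizeof(double),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        split_and_rebuild<<<(n + 127) / 128, 128>>>(d_in, d_parts, d_out, n);
        // The blocking copy back also waits for the kernel to finish.
        ok = cudaMemcpy(host_out.data(), d_out, n * sizeof(double),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    for (int i = 0; ok && i < n; ++i)
        ok = (host_out[i] == host_in[i]);

    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_parts);
    return ok;
}
```

Note that splitting a double this way does not by itself make an update atomic: two separate 32-bit atomic operations on the halves would not behave like one atomic operation on the whole 64-bit value.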

Cheers,
Crip_crop
mkcolg



Joined: 30 Jun 2004
Posts: 6126
Location: The Portland Group Inc.

Posted: Thu Jul 05, 2012 2:35 pm

Hi Crip Crop,

Sorry, but I've not tried this before. Have you seen anything on doing this in CUDA C? We might be able to then translate it to CUDA Fortran.

Note that we're really close to getting textures into CUDA Fortran. The push to get OpenACC fully implemented did delay textures a bit, but once OpenACC 1.0 is out, we should be able to get back and finish it up.

- Mat
crip_crop



Joined: 28 Jul 2010
Posts: 68

Posted: Fri Jul 06, 2012 3:37 am

Well, the only example I can give at the moment is a bit of CUDA C I wrote a couple of years back, but it involves textures. It may be useful, though...

Code:


/* File scope */

/* Declare texture references for arrays a and b */
texture<int2, 1> texRefa;
texture<int2, 1> texRefb;

/* Host code */

/* Bind the texture reference for a to the device array Ad */
cudaBindTexture(0, texRefa, Ad, size);

/* Bind the texture reference for b to the device array Bd */
cudaBindTexture(0, texRefb, Bd, size);

/* Device code */

/* Fetch a and b values from texture memory, accumulating the result in sum.
   Each int2 holds one double: .y is the high word, .x is the low word. */
int2 sha = tex1Dfetch(texRefa, ii+k);
int2 shb = tex1Dfetch(texRefb, jj+k);
sum = sum + __hiloint2double(sha.y, sha.x) * __hiloint2double(shb.y, shb.x);
}   /* closes an enclosing block that is not shown in this excerpt */



Although this is quite different from what I need it might give you an idea of what I'm trying to achieve (but minus the texture and plus an atomic function).

NVIDIA's forums are down for maintenance at the moment so I'm struggling to find any more suitable C examples.

Please let me know if this helps at all.

Cheers,
Crip_crop
mkcolg



Joined: 30 Jun 2004
Posts: 6126
Location: The Portland Group Inc.

Posted: Fri Jul 06, 2012 9:54 am

Hi Crip Crop,

Quote:
I basically need to use atomic memory functions to prevent a race condition in my device code, but I haven't been able to because NVIDIA only supports these functions in single precision.
Sorry that I didn't point this out earlier, but NVIDIA does support 64-bit atomic operations on newer GPUs (compute capability 2.x). Maybe the easiest thing to do would be to upgrade your card?
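As an illustration of how a 64-bit atomic can be used for double-precision data in CUDA C, here is a minimal sketch of a double-precision atomic add built on the 64-bit integer `atomicCAS`, following the compare-and-swap loop pattern shown in the CUDA C Programming Guide. The names `atomic_add_double`, `accumulate` and `sum_with_atomics` are hypothetical and chosen for this example only; a CUDA Fortran version would need to be translated from it.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Atomically adds val to *address and returns the previous value.
// The double is reinterpreted as a 64-bit integer so atomicCAS can be used.
__device__ double atomic_add_double(double *address, double val)
{
    unsigned long long int *address_as_ull = (unsigned long long int *)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Store (assumed + val) only if *address still holds 'assumed'.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
        // Compare as integers so a NaN value cannot cause an endless loop.
    } while (assumed != old);
    return __longlong_as_double(old);
}

// Every thread adds the same value to a single accumulator.
__global__ void accumulate(double *sum, double value, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomic_add_double(sum, value);
}

// Hypothetical host helper: adds 'value' n times on the device.
// Returns the total, or -1.0 if a CUDA call fails.
double sum_with_atomics(int n, double value)
{
    double *d_sum = nullptr;
    double h_sum = 0.0;
    if (cudaMalloc(&d_sum, sizeof(double)) != cudaSuccess)
        return -1.0;
    bool ok = cudaMemcpy(d_sum, &h_sum, sizeof(double),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        accumulate<<<(n + 255) / 256, 256>>>(d_sum, value, n);
        ok = cudaMemcpy(&h_sum, d_sum, sizeof(double),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_sum);
    return ok ? h_sum : -1.0;
}
```

For example, `sum_with_atomics(4096, 0.5)` should give 2048.0, since every partial sum is exactly representable. With many threads contending for one address the loop retries often, so this is best treated as a correctness fix for race conditions rather than a fast reduction.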

- Mat
crip_crop



Joined: 28 Jul 2010
Posts: 68

Posted: Mon Jul 09, 2012 7:38 am

Well, I have a Fermi C2050, so presumably it would be supported on there.

Does this mean that this statement in the PGI CUDA Fortran user guide is incorrect, then?

Quote:
Arithmetic and Bitwise Atomic Functions
These atomic functions read and return the value of the first argument. They also combine that value with
the value of the second argument, depending on the function, and store the combined value back to the first
argument location. Both arguments must be of type integer(kind=4).
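Whether a given card, such as the C2050 mentioned above, reports compute capability 2.x can be checked at run time from CUDA C. The sketch below uses the runtime call `cudaGetDeviceProperties`; the helper name `compute_capability` and its encoding of the result as major * 10 + minor are assumptions made for this example.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns major * 10 + minor for the given device
// (for example 20 for compute capability 2.0), or -1 if the query fails.
int compute_capability(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return -1;
    printf("Device %d: %s, compute capability %d.%d\n",
           device, prop.name, prop.major, prop.minor);
    return prop.major * 10 + prop.minor;
}
```

A Fermi-generation Tesla C2050 should report 2.0 here.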


Cheers,
Crip_crop
All times are GMT - 7 Hours
Page 1 of 5

 