PGI User Forum


Strong typing and memory copy
SarahA



Joined: 29 Aug 2006
Posts: 16

Posted: Thu Mar 25, 2010 8:59 am

mkcolg wrote:
Hi Sarah,

Quote:
It doesn't help that EQUIVALENCE is documented as unsupported for CUDA 2.3 ... it does compile without error, but it doesn't seem to work. Of course, I could have another error.
The compiler should catch this and give an error if EQUIVALENCE is used. I've submitted a problem report (TPR#16726) to have this fixed.

Hope this helps,
Mat


It would help more to support EQUIVALENCE. I have tested it, and for the uses I have, it does seem to work. I can understand why device COMMON is not supported, but given all the other limitations EQUIVALENCE would be very nice to have.
SarahA



Joined: 29 Aug 2006
Posts: 16

Posted: Thu Mar 25, 2010 9:05 am

Quote:

Well then, at the risk of my questions getting dumber and dumber...
How does one go about specifying and calling CUDA C kernels from a CUDA Fortran program?
Sarah


To answer my own question, here is an example. It was fairly easy, and yes, it was a dumb question.

nvcc -c csub.cu
pgfortran afort.CUF csub.o

afort.CUF
Code:

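      program afort
      implicit none
      real, dimension(1000) :: this            ! host array
      real, device, dimension(1000) :: dthis   ! device array
      integer i

      do i = 1, 1000
         this(i)= i
      enddo

      dthis= this            ! copy host to device
      call addone( dthis )   ! C wrapper addone_ in csub.cu launches the kernel
      this = dthis           ! copy device back to host
      print *, this
      end program afort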


csub.cu

Code:
#include <stdio.h>
#include <stdlib.h>

#include <cuda.h>

// Adds one to each element; the guard skips threads past the end of the array.
__global__ void addone_kernel( float *data ) {
  int idx = blockIdx.x*blockDim.x + threadIdx.x;
  if ( idx<1000 ) data[idx] = data[idx] + 1.0f;
}

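// extern "C" prevents C++ name mangling; the trailing underscore matches the
// symbol pgfortran emits for "call addone".
extern "C" {

void addone_( float *data ) {
  // data is a device pointer: the Fortran caller passes the device array dthis
  addone_kernel<<<50,200>>>( data );
}
}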
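
A note on the launch configuration: <<<50,200>>> starts 50 x 200 = 10,000 threads for 1,000 elements, and the idx<1000 guard is what keeps the surplus threads from writing out of bounds. Below is a minimal host-side sketch in plain C of the usual way to size the grid from the element count. The helper names blocks_for and count_active_threads are hypothetical and not part of CUDA; the second function only emulates the kernel's index arithmetic on the CPU.

```c
#include <assert.h>

/* Smallest number of blocks such that blocks * threads_per_block >= n. */
int blocks_for(int n, int threads_per_block) {
    assert(n >= 0 && threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

/* CPU emulation of the kernel's indexing (hypothetical helper):
   counts how many launched threads pass the bounds guard idx < n. */
int count_active_threads(int n, int blocks, int threads_per_block) {
    int active = 0;
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threads_per_block; ++thread) {
            /* same formula as blockIdx.x*blockDim.x + threadIdx.x */
            int idx = block * threads_per_block + thread;
            if (idx < n) ++active;
        }
    }
    return active;
}
```

With 200 threads per block, blocks_for(1000, 200) gives 5, so <<<5,200>>> would cover the same 1,000 elements. The guard is still worth keeping for sizes that are not a multiple of the block size.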
mkcolg



Joined: 30 Jun 2004
Posts: 6119
Location: The Portland Group Inc.

Posted: Mon Mar 29, 2010 3:40 pm

Hi Sarah,

Quote:
To answer my own question, here is an example. It was fairly easy, and yes, it was a dumb question.
It's actually one of our more common questions, so not dumb at all. Most users make it more difficult than it really is. As your example shows, it's not much different than standard Fortran-to-C interoperability. For another example, I wrote an article (http://www.pgroup.com/lit/articles/insider/v2n1a4.htm) that has a CUDA Fortran program calling a CUDA C random number generator.
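
For anyone new to the interoperability side, here is a minimal sketch in plain host C of the convention the example relies on. It assumes the Fortran compiler lowercases the routine name, appends one underscore, and passes every argument by reference; that naming is compiler-dependent, and Fortran 2003's bind(C) with ISO_C_BINDING is the portable way to pin it down. The routine scale_ is hypothetical and runs on the host, unlike addone_, which receives a device pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine, callable from Fortran as:
       call scale(data, n, factor)
   Scalars arrive as pointers because Fortran passes by reference. */
void scale_(float *data, const int *n, const float *factor) {
    assert(data != NULL && n != NULL && factor != NULL);
    for (int i = 0; i < *n; ++i) {
        data[i] *= *factor;   /* C index i corresponds to Fortran index i+1 */
    }
}
```

From C, the same routine would be called as scale_(a, &n, &f), which mirrors what the Fortran call site does behind the scenes.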

As for EQUIVALENCE, we do have a feature request in for it (TPR#16198). I'll add a note that more users are asking for it and bump up its priority. It's my understanding that it will be fairly difficult to implement since there isn't an equivalent method to perform this in C.

Thanks,
Mat
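
For readers less familiar with EQUIVALENCE: it makes two variables share the same storage. The sketch below shows the nearest plain C idioms for the simplest case only, a byte copy and a union, assuming a 32-bit IEEE-754 float. The names float_bits and real_or_int are hypothetical. C has no construct that overlays arbitrary, separately declared variables, so this is an illustration of the idea and not a description of how the compiler handles it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bytes as a 32-bit integer by copying them. */
uint32_t float_bits(float value) {
    uint32_t bits;
    assert(sizeof bits == sizeof value);   /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* A union gives two members the same storage, roughly like
   EQUIVALENCE (r, i) for two scalars of the same size. */
union real_or_int {
    float    r;
    uint32_t i;
};
```

Writing u.r and then reading u.i on a union real_or_int yields the same bit pattern that float_bits returns for that value.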
All times are GMT - 7 Hours