PGI User Forum


calling cublas from parallel regions.

 
GPGPU is good



Joined: 07 May 2013
Posts: 1

Posted: Tue Jun 17, 2014 10:57 am    Post subject: calling cublas from parallel regions.

Hello all,
I am trying to call cuBLAS library device routines from OpenACC parallel regions in C. My proof-of-concept code does not compile, and my attempts so far have led me to the following observations and code:

    I understand that I need to link against cublas_device and cudadevrt.
    Linking against cublas_device only works if I use the second version of the cuBLAS API (cublas_v2.h).
    The second version of the cuBLAS API requires declaring a cublasContext handle (cublasHandle_t).
    However, using the cuBLAS context handle generates errors during compilation.

Here is my current code; the compilation output follows it:
Code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <cuda_runtime.h>
#include <cublas_v2.h>

#pragma acc routine (cublasSaxpy) seq
#pragma acc routine (cublasCreate) seq
#pragma acc routine (cublasDestroy) seq

int main(int argc, char **argv) {

   float *x, *y, alpha = 2, *ptr_alpha;
   ptr_alpha = &alpha;
   int n = 1 << 20, i;

   x = (float*) malloc(n * sizeof(float));

   y = (float*) malloc(n * sizeof(float));

#pragma acc data create(x[0:n]) copyout(y[0:n]) copyin(ptr_alpha[0:1])
   {
#pragma acc kernels
      {
#pragma acc loop independent
         for (i = 0; i < n; i++) {
            x[i] = 1.0f;
            y[i] = 0.0f;
         }
      }

#pragma acc parallel num_gangs(1)
      {
         cublasHandle_t cnpHandle;
         int status;

         status = cublasCreate(&cnpHandle);

         if (CUBLAS_STATUS_SUCCESS == status) {
            /* Perform operation using cublas */
            cublasSaxpy(cnpHandle, n, ptr_alpha, x, 1, y, 1);
            cublasDestroy(cnpHandle);
         }
      }
   }

   fprintf(stdout, "y[0] = %f\n", y[0]);
   free(x);
   free(y);
   return 0;
}


The output I am getting from PGI is:
Quote:
PGCC-S-0107-Struct or union cublasContext not yet defined (callcublas3.c: 16)
PGCC-S-0155-Cannot determine bounds for array cnpHandle (callcublas3.c: 47)
PGCC-S-0155-Cannot determine bounds for array cnpHandle (callcublas3.c: 47)
PGCC-S-0155-Cannot determine bounds for array cnpHandle (callcublas3.c: 47)


The compilation command I am using is:
Code:
pgc++ -Minfo=all -Mcuda  -ta=tesla:cc35,cuda5.5 -I ~/installed/pgi/linux86-64/2014/cuda/5.5/include  -L ~/installed/pgi/linux86-64/2014/cuda/5.5/lib64 callcublas3-2.c -lcublas_device -lcudadevrt 


I was able to eliminate the errors about the cnpHandle bounds by defining the handle outside the ACC data region and copying it using [:1], but I have not been able to fix the error about the cuBLAS context yet. It also seems strange to me that PGI is trying to copy cnpHandle, since it is declared inside the parallel region.
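
The PGCC-S-0107 message is consistent with how cuBLAS declares its handle: the public headers expose cublasHandle_t as a pointer to struct cublasContext, and that struct is only forward-declared there. The sketch below reproduces this opaque-handle pattern in plain C using hypothetical names (demoContext, demoHandle_t, demoCreate and demoDestroy are not part of any library). Whether the PGI compiler fails because it tries to size the pointed-to struct for a device copy is an assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* "Public header" view: the struct is only forward-declared, so user code
 * can hold a pointer to it but cannot see its size or its fields.
 * demoContext is a hypothetical stand-in for cublasContext. */
struct demoContext;
typedef struct demoContext *demoHandle_t;

/* The handle is just a pointer, so its own size is always known. */
size_t demo_handle_size(void)
{
    return sizeof(demoHandle_t);
}

/* At this point sizeof(struct demoContext) would be a compile error,
 * because the type is still incomplete. A compiler that wants to copy
 * the pointed-to object would be stuck in the same way. */

/* "Library side" view: only the implementation sees the full layout. */
struct demoContext {
    int   device;
    float scratch[4];
};

/* Allocate a context and hand back an opaque handle; returns 0 on success. */
int demoCreate(demoHandle_t *handle)
{
    assert(handle != NULL);
    *handle = calloc(1, sizeof **handle);
    return (*handle != NULL) ? 0 : 1;
}

/* Release the context behind the handle; returns 0 on success. */
int demoDestroy(demoHandle_t handle)
{
    free(handle);
    return 0;
}
```

User code only ever passes the handle around by value, as the calls to cublasCreate and cublasDestroy in the program above do, and never needs the struct definition itself.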

Is it possible to get around this error?

I am thinking that compiling a wrapper device function with nvcc and calling it from the ACC region might work, but I am wondering whether it can be done directly in this code without a wrapper.

Thank you.
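
For the SAXPY in this particular test, one option that needs neither a cuBLAS handle nor an nvcc wrapper is to write the operation itself as an OpenACC loop. The following is a minimal sketch of that substitution, not the cuBLAS device routine: saxpy_acc and run_saxpy_demo are hypothetical names, and the data clauses are one reasonable choice among several.

```c
#include <assert.h>
#include <stdlib.h>

/* y = alpha * x + y, written as a plain OpenACC loop instead of a
 * device-side cublasSaxpy call. If the compiler is run without OpenACC
 * support, the pragma is ignored and the loop runs serially on the host. */
void saxpy_acc(int n, float alpha, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);
#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = alpha * x[i] + y[i];
    }
}

/* Hypothetical driver mirroring the original test: x = 1, y = 0, alpha = 2.
 * Returns y[0], or -1.0f if n is not positive or allocation fails. */
float run_saxpy_demo(int n)
{
    float first = -1.0f;
    if (n <= 0) {
        return first;
    }

    float *x = malloc((size_t)n * sizeof *x);
    float *y = malloc((size_t)n * sizeof *y);
    if (x != NULL && y != NULL) {
        for (int i = 0; i < n; i++) {
            x[i] = 1.0f;
            y[i] = 0.0f;
        }
        saxpy_acc(n, 2.0f, x, y);
        first = y[0];
    }
    free(x);
    free(y);
    return first;
}
```

With x = 1, y = 0 and alpha = 2 as in the original test, every element of y should end up as 2.0, so run_saxpy_demo(1 << 20) is expected to return 2.0.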
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue Jun 17, 2014 3:15 pm

Hi GPGPU is good,

Having the ability to call the cuBLAS device routines (as well as other CUDA C device routines) has been a goal for some time now, and it is one of the reasons the "routine" pragma was created. However, "routine" is very new and we only have basic support available. 14.7 will expand this support, but I think it will be a bit longer before we can get your example to work as is.

I added TPR#20600 to track your example and sent it on to engineering.

Thanks!
Mat
All times are GMT - 7 Hours