mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
Posted: Thu May 02, 2013 10:44 am
| Quote: | | The former includes an interface for the "sgemm" routine, while the latter includes a "use cublas" statement with no interface. Could you please briefly explain the difference between the two? |
That's the difference. One shows how to write an explicit interface, and the other uses the cublas interface module. The cublas module is something PGI created as a convenience for users, especially since NVIDIA kept changing the interface to CUBLAS.
| Quote: | | Secondly, my final intention is to use the "CuSparse" library. Is there an example of using it with the PGI compiler that I can look at? I ask because "cusparse" is much more complicated than "cublas". |
You'll need to either add an interface for the CUSPARSE routines or use the C interface file NVIDIA provides. http://docs.nvidia.com/cuda/cusparse/index.html#topic_14
- Mat
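The two approaches contrasted in this reply can be sketched roughly as follows. This is a minimal, untested sketch, not the code from either original example. It assumes the legacy CUBLAS C entry point `cublasSgemm` for the hand-written interface, and the legacy-style `cublasSgemm` wrapper from PGI's `cublas` module for the second approach. The names `sgemm_iface` and `use_cublas_module` are hypothetical.

```fortran
! Approach 1: a hand-written explicit interface (hypothetical module name).
! Assumption: the legacy CUBLAS C entry point cublasSgemm, which takes its
! scalar arguments by value and its matrices as device pointers.
module sgemm_iface
  interface
    subroutine sgemm(transa, transb, m, n, k, alpha, a, lda, &
                     b, ldb, beta, c, ldc) bind(c, name='cublasSgemm')
      use iso_c_binding
      implicit none
      character(kind=c_char), value :: transa, transb
      integer(c_int), value :: m, n, k, lda, ldb, ldc
      real(c_float), value :: alpha, beta
      real(c_float), device :: a(lda,*), b(ldb,*), c(ldc,*)
    end subroutine sgemm
  end interface
end module sgemm_iface

! Approach 2: rely on the interfaces PGI ships in the cublas module.
! Assumption: the module exposes a legacy-style cublasSgemm that accepts
! CUDA Fortran device arrays directly.
program use_cublas_module
  use cudafor
  use cublas
  implicit none
  integer, parameter :: n = 4
  real :: a(n,n), b(n,n), c(n,n)
  real, device :: a_d(n,n), b_d(n,n), c_d(n,n)

  a = 1.0
  b = 2.0
  c = 0.0

  ! Host-to-device copies via CUDA Fortran array assignment.
  a_d = a
  b_d = b
  c_d = c

  ! c_d = 1.0 * a_d * b_d + 0.0 * c_d
  call cublasSgemm('N', 'N', n, n, n, 1.0, a_d, n, b_d, n, 0.0, c_d, n)

  ! Device-to-host copy of the result.
  c = c_d
  print *, c(1,1)
end program use_cublas_module
```

With the first approach, a caller would `use sgemm_iface` and call `sgemm` with the same argument list. The trade-off is that a hand-written interface has to be kept in sync with the library yourself, whereas the module version is maintained by the compiler vendor. Both variants need to be built with CUDA Fortran enabled and linked against the CUBLAS library.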
OmidKar
Joined: 23 Jan 2013 Posts: 10
Posted: Thu May 02, 2013 4:11 pm
| Quote: | | The cublas module is something PGI created as a convenience for users, especially since NVIDIA kept changing the interface to CUBLAS. |
Do we have access to this module? Where can I find it to make some changes? Is a similar module provided by PGI for CUSPARSE?
Thank you,
Omid
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
Posted: Thu May 02, 2013 4:28 pm
| Quote: | | Do we have access to this module? Where can I find it to make some changes? |
The cublas module (.mod) file can be found in the include directory. However, we do not ship the source for this file.
| Quote: | | Is a similar module provided by PGI for CUSPARSE? |
No. Ideally NVIDIA would provide one, since they manage CUSPARSE. You should be able to use the interface file they provide, though. It's an F77-style interface, but you should be able to skip the cuda_malloc/cuda_free calls and pass CUDA Fortran device variables directly to the routine. Granted, I haven't tried it myself, but in theory it should just work.
- Mat