PGI User Forum

Change L1 cache size in Fermi
TheMatt



Joined: 06 Jul 2009
Posts: 317
Location: Greenbelt, MD

Posted: Thu May 19, 2011 9:45 am

Dredging up an old thread...

mkcolg wrote:
Hi Faustus,

You should be able to access any CUDA C function from CUDA Fortran. For CUDA C functions without a built-in CUDA Fortran interface, such as cudaFuncSetCacheConfig, you simply need to write an explicit interface to the function before calling it.

- Mat


Mat, et al,

I'm trying to use cudaFuncSetCacheConfig to see whether it affects my code at all (on my way to maybe trying cudaThreadSetCacheConfig). So, I first tried whipping up an interface:
Code:
module extracuda

interface
   integer function cudafuncsetcacheconfig(func, cacheconfig) bind(c,name='cudaFuncSetCacheConfig')
      use iso_c_binding

      character(len=*) :: func
      integer :: cacheconfig
   end function cudafuncsetcacheconfig
end interface

end module extracuda

And then in my code I did:
Code:
      use extracuda
...
      write (*,*) "Got here!"

      istat = cudaFuncSetCacheConfig('soradcuf',2)

      call soradcuf<<<dimGrid, dimBlock>>>(...args...)
      istat = cudaGetLastError()
      if (istat /= 0) then
         write (*,*) "Kernel Call failed: ", cudaGetErrorString(istat)
         stop
      end if

where I'm pretty sure cudaFuncCachePreferL1 = 2 in CUDA enum speak.

I linked everything up with the usual -lcudart and -lcuda and I get:
Code:
> ./runsorad-cudafor-flxy-lessconstants-funccache-DPvDPorig.exe
 blocksize:           256
 Current Device:             0

Device:Tesla M2070, 1147.0 MHz clock, 4096.0 MB memory.

 Iteration:             1
 eps:    1.0000000E-06
 Current Device:             0
 Got here!
 Kernel Call failed:
 invalid device function                                                                                                         
Warning: ieee_underflow is signaling
Warning: ieee_inexact is signaling
FORTRAN STOP

Now, if I comment out my cudaFuncSetCacheConfig call, the code works just fine, so I broke something with the interface (as per usual when I try to do C interfacing).

Any help from the gurus out there on what I might have screwed up?
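
For reference, one likely problem in the interface above is argument passing. With bind(c), a scalar integer dummy argument is passed by reference unless it carries the value attribute, and under Fortran 2003 rules character(len=*) is not interoperable with a C char pointer. The sketch below shows an interoperable form, assuming the CUDA 4.0-era C prototype that takes a const char *func and an enum passed as a C int. The module name, function name, and constant names are hypothetical, and this has not been verified against the PGI runtime.

```fortran
module extracuda_sketch
   use iso_c_binding, only: c_char, c_int
   implicit none

   ! Values of the C enum cudaFuncCache. The names are hypothetical,
   ! chosen so they do not clash with constants from the cudafor module.
   integer(c_int), parameter :: cache_prefer_none_c   = 0
   integer(c_int), parameter :: cache_prefer_shared_c = 1
   integer(c_int), parameter :: cache_prefer_l1_c     = 2

   interface
      ! Assumed C prototype (CUDA 4.0 era):
      !   cudaError_t cudaFuncSetCacheConfig(const char *func,
      !                                      enum cudaFuncCache cacheConfig);
      function cuda_func_set_cache_config_c(func, cacheconfig) result(istat) &
            bind(c, name='cudaFuncSetCacheConfig')
         import :: c_char, c_int
         ! A C string maps to an assumed-size array of c_char.
         character(kind=c_char), dimension(*), intent(in) :: func
         ! The enum is passed by value, so the value attribute is required.
         integer(c_int), value :: cacheconfig
         integer(c_int) :: istat
      end function cuda_func_set_cache_config_c
   end interface

end module extracuda_sketch
```

A caller would pass a NUL-terminated name, for example 'soradcuf'//c_null_char with c_null_char taken from iso_c_binding. Whether the runtime recognizes that undecorated name for a CUDA Fortran kernel is a separate open question, since the compiler may emit a different symbol name for the kernel.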
mkcolg



Joined: 30 Jun 2004
Posts: 6119
Location: The Portland Group Inc.

Posted: Fri May 20, 2011 3:51 pm

Hi Matt,

In 11.4 we added an interface to cudaFuncSetCacheConfig in the cudafor module. You can now call the routine directly.

The error, though, suggests that cudaFuncSetCacheConfig is getting called but is being given an unknown kernel. While I don't know the details, from looking at our wrapper function it seems we convert the Fortran string into an integer array, which is then passed to the CUDA C cudaFuncSetCacheConfig function.

Can you try using the CUDA Fortran module's interface and see if that works around the issue?

Thanks,
Mat
TheMatt




Posted: Mon May 23, 2011 12:21 pm


I'll let you know soon (the Fermi system is getting 11.5 installed on it as soon as the sysadmins have the time)!

Thanks,
Matt
TheMatt




Posted: Tue May 24, 2011 4:36 am

Okay, the compiler recognizes the call, which is good, but the program crashes in various ways when I try to use it. First off, the only difference between the working code and the non-working code with the cudaFuncSetCacheConfig call is:
Code:
write (*,*) "Got here!"

istat = cudaFuncSetCacheConfig(soradcuf,cudaFuncCachePreferL1)

Following the cudaFuncSetCacheConfig call is my kernel call to soradcuf:
Code:
call soradcuf<<<dimGrid, dimBlock>>> (...args...)
istat = cudaGetLastError()
if (istat /= 0) then
   write (*,*) "Kernel Call failed: ", cudaGetErrorString(istat)
   stop
end if

When I compile that (with no errors) and run it I get:
Code:
> ./runsorad-cudafor-flxy-lessconstants-funccache-DPvDPorig.exe
 blocksize:           256
 Current Device:             0

Device:Tesla M2070, 1147.0 MHz clock, 4096.0 MB memory.

 Iteration:             1
 eps:    1.0000000E-06
 Current Device:             0
 Got here!
Segmentation fault (core dumped)

Hmm. This cored out before even getting to the error write under the kernel call. Maybe it needs strings and integers? Let's try:
Code:
istat = cudaFuncSetCacheConfig('soradcuf',2)
which leads to:
Code:

> ./runsorad-cudafor-flxy-lessconstants-funccache-DPvDPorig.exe
 blocksize:           256
 Current Device:             0

Device:Tesla M2070, 1147.0 MHz clock, 4096.0 MB memory.

 Iteration:             1
 eps:    1.0000000E-06
 Current Device:             0
 Got here!
 Kernel Call failed:
 invalid device function
Warning: ieee_underflow is signaling
Warning: ieee_inexact is signaling
FORTRAN STOP

Well, that one failed when calling the function, but it at least got as far as the error write and stop.
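
One thing the run above cannot tell apart is which call produced the error. The istat returned by cudaFuncSetCacheConfig is overwritten before it is tested, and cudaGetLastError reports the most recent error from any runtime call on the host thread, not only from the kernel launch. A small helper along the following lines would check each status where it is returned. This is only a sketch: the module and subroutine names are hypothetical, and cudaGetErrorString is used the same way as in the code above.

```fortran
module cuda_check_sketch
   use cudafor
   implicit none

contains

   ! Report a failing CUDA runtime status together with a caller-supplied
   ! label, then stop. A status of 0 means success, as in the code above.
   subroutine check_cuda(istat, label)
      integer, intent(in) :: istat
      character(len=*), intent(in) :: label

      if (istat /= 0) then
         write (*,*) label, " failed: ", cudaGetErrorString(istat)
         stop
      end if
   end subroutine check_cuda

end module cuda_check_sketch

! Intended use in the host code (not compiled here):
!    istat = cudaFuncSetCacheConfig('soradcuf', 2)
!    call check_cuda(istat, "cudaFuncSetCacheConfig")
!    call soradcuf<<<dimGrid, dimBlock>>>(...args...)
!    call check_cuda(cudaGetLastError(), "soradcuf launch")
```

If the first check fires, the "invalid device function" message comes from the cache configuration call itself rather than from the kernel launch.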

Any hints?

Matt
mkcolg




Posted: Tue May 24, 2011 10:48 am

Hi Matt,

I'm not sure what's wrong. I haven't used this function much but my little test cases seem fine. Instead of me trying to figure out how to recreate the error, can you send me your tests?

Thanks,
Mat