xlapillonne
Joined: 16 Feb 2011 Posts: 69
|
Posted: Tue Mar 26, 2013 10:33 am Post subject: error 709: Context is destroyed or not yet created - pgi13.3 |
|
|
Hi,
We have an MPI code that mixes Fortran and CUDA parts. Before the MPI init we need to set the device by calling cudaSetDevice.
We then also call acc_set_device_num to set the device for the directives (it seems to be necessary to call both).
This used to work fine with PGI 12.10, but with 13.3 I am getting an error at runtime.
I was able to reproduce the problem in a simple code (without MPI):
| Code: |
!test setdevice
program main
  use openacc
  implicit none
  integer :: ndev, mydev, ierr
  enum, bind(C) !:: cudaError
    enumerator :: cudaSuccess=0
  end enum ! cudaError
  interface ! [['cudaError_t', None], 'cudaSetDevice', [['int', None, 'device']]]
    function cudaSetDevice(device) result( res ) bind(C, name="cudaSetDevice")
      use, intrinsic :: ISO_C_BINDING
      import cudaSuccess
      implicit none
      integer(c_int), value :: device
      integer (KIND(cudaSuccess)) :: res
    end function cudaSetDevice
  end interface
  mydev=0
  ierr = cudaSetDevice(mydev)
  if (ierr>0) print*, 'Error with cudaSetDevice'
  ndev=acc_get_num_devices(acc_device_nvidia)
  print*, 'ndev=',ndev
  call acc_set_device_num(mydev,acc_device_nvidia)
  print*, 'devid=', mydev
end program main
|
It is compiled as follows:
| Code: |
pgf90 -ta=nvidia -acc -o test_setdevice test_setdevice.f90 -L$CUDALIB -lcudart -lcuda
|
With 12.10 I get:
| Code: |
mpiexec -n 1 ./test_setdevice
ndev= 2
devid= 0
|
With 13.3:
| Code: |
mpiexec -n 1 ./test_setdevice
call to cuMemAlloc returned error 709: Context is destroyed or not yet created
|
Any idea why this works with 12.10 but no longer with 13.3?
Thanks,
Xavier |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Tue Mar 26, 2013 2:08 pm Post subject: |
|
|
Hi Xavier,
I'm wondering if it's a mismatch between CUDA versions. It seems to work for me when using 13.1 and CUDA 5.0 (12.10 uses CUDA 4.1 by default). I'll need to install an earlier version of CUDA to see if I can reproduce your error but am short on time today. Which CUDA version are you using?
| Code: |
% pgfortran -ta=nvidia,5.0 -acc -o test_setdevice test_setdevice.f90 -V13.1 -L/opt/cuda-5.0/lib64 -lcudart -lcuda
% mpirun -n 1 a.out
ndev= 2
devid= 0
|
- Mat |
|
xlapillonne
Joined: 16 Feb 2011 Posts: 69
|
Posted: Wed Mar 27, 2013 10:11 am Post subject: |
|
|
Hi Mat,
I've just tried with CUDA 5.0:
| Code: |
pgf90 -ta=nvidia,5.0 -acc -o test_setdevice test_setdevice.f90 -L/apps/castor/CUDA-5.0/lib64 -lcudart -lcuda
|
and I still get the error. Note that I am using 13.3.
I've tried with 13.2 and 13.1 and indeed it works for these older versions.
Xavier |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Wed Mar 27, 2013 11:23 am Post subject: |
|
|
Hi Xavier,
I was able to reproduce the issue in 13.3. I'm not entirely sure what's going on here, since it works fine if you link with "-Mcuda". I've added a problem report (TPR#19236) and sent it on for investigation.
- Mat
| Code: |
% pgfortran -ta=nvidia,5.0 -acc -o test_setdevice test_setdevice.f90 -V13.3 -L/opt/cuda-5.0/lib64 -lcudart -lcuda
% test_setdevice
call to cuMemAlloc returned error 709: Context is destroyed or not yet created
% pgfortran -ta=nvidia,5.0 -acc -o test_setdevice test_setdevice.f90 -V13.3 -L/opt/cuda-5.0/lib64 -lcudart -lcuda -Mcuda
% test_setdevice
ndev= 2
devid= 0
|
|