PGI User Forum
Error Message: call to cuInit returned error 100: No device

 
PGI User Forum -> Accelerator Programming
njackson



Joined: 14 May 2010
Posts: 9

Posted: Mon Jul 05, 2010 3:58 pm    Post subject: Error Message: call to cuInit returned error 100: No device

I'm trying to build an Accelerator program for the first time and PGI Accelerator (or CUDA) doesn't appear to be seeing the NVIDIA hardware.

The build of my little test program goes okay and ends with the message: "Inner sequential loop scheduled on accelerator". But when I run the program, I get "call to cuInit returned error 100: No device".

When I run pgaccelinfo -v, I get:

CUDA Driver Version 2030
could not initialize CUDA runtime, error code=100
libamdcalcl.so not found
No accelerators found.
Check that you have installed the CUDA or CAL libraries properly
Check that your LD_LIBRARY_PATH environment variable points to the CUDA or CAL runtime installation directory
Check the permissions on your device

pgfortran -V reports "pgfortran 10.2-1 64-bit target on x86-64 Linux -tp nehalem-64" and it is running on CentOS release 5.4 (Final) (x86_64-redhat-linux-gnu GNU/Linux) on Linux release 2.6.18-164.11.1.el5.

The PGI Installation manual says that the CUDA software is installed as part of the PGI installation but it does not mention CAL. (What is CAL exactly?)

libcuda.so is present and is on the LD_LIBRARY_PATH. I don't know where to look for libamdcalcl.so, but it is not in /lib, /usr/lib, /usr/local/lib, /opt/pgi/linux86-64/10.2/lib, nor /opt/pgi/linux86-64/10.2/libso.
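
One way to confirm which directories on the search path actually hold a given library is a small shell helper like the one below. This is only a sketch: the function name `find_lib_on_path` is hypothetical, it only checks that files exist (not that the dynamic loader would pick them), and it skips empty path entries.

```shell
# Print every file on a colon-separated search path whose name starts
# with the given library name. Returns 0 if anything matched, else 1.
# Usage: find_lib_on_path libcuda.so "$LD_LIBRARY_PATH"
find_lib_on_path() {
    lib=$1
    search_path=$2
    found=1
    old_ifs=$IFS
    IFS=:
    # The path is split on ":" once, when the loop starts.
    for dir in $search_path; do
        IFS=$old_ifs
        # Skip empty entries such as a leading or trailing ":".
        [ -n "$dir" ] || continue
        for f in "$dir/$lib"*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
                found=0
            fi
        done
    done
    IFS=$old_ifs
    return $found
}
```

For example, `find_lib_on_path libcuda.so "$LD_LIBRARY_PATH"` lists every matching file, including versioned names such as libcuda.so.1.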

Please advise where I should go from here. Is there a tool that can detect the NVIDIA hardware? Once I've determined that it can be detected, how do I get the PGI Accelerator to see it? Thanks!

Regards,
Neil.

Neil L. Jackson

P.S. The hardware is two Intel Xeon 5500 Series processors and a TESLA C1060 card.

P.P.S. nvidia-installer reports that I'm running version 190.18 of the Nvidia driver (and that the latest available version is 256.35). I tried running nvidia-settings, but it complains that I'm not running X-Windows on the Nvidia card (I can't see any reason why I would) and consequently fails to provide any information.
dholt



Joined: 30 Jul 2008
Posts: 15
Location: The Portland Group Inc.

Posted: Wed Jul 07, 2010 9:58 am

Hi Neil,

I would first try updating your NVIDIA driver to something more recent; the CUDA 3.0 drivers are available for download from NVIDIA.

If you have the CUDA SDK installed and built, you can try running 'deviceQuery' ($SDK_INSTALL_PATH/C/bin/linux/release/deviceQuery); if it doesn't see your card then the PGI compilers won't see your card either.

If it doesn't see your card, check to make sure your driver is loaded:
Code:
lsmod | grep nvidia


You should see something similar to:
Code:
nvidia              10840968  38
i2c_core               56129  3 i2c_ec,nvidia,i2c_i801


If not, you can run (as root):
Code:
modprobe -v nvidia


Also check to make sure the devices have been created in Linux; run:
Code:
ls -l /dev/nvidia*


It should show something similar to:
Code:
crw-rw-rw- 1 root root 195,   0 Jun 21 09:53 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Jun 21 09:54 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Jun 21 09:53 /dev/nvidiactl


You can add devices by running (as root):
Code:
mknod /dev/nvidia0 c 195 0
mknod /dev/nvidia1 c 195 1
mknod /dev/nvidiactl c 195 255


You will also have to make sure these devices have appropriate permissions to allow you to access them. If all of that checks out, let me know and we can try something else.

(CAL is part of the AMD Stream SDK; you can just ignore any references to it.)
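
The checks in this post can be wrapped in two small shell functions so they are easy to repeat after each change. This is a minimal sketch, not a PGI or NVIDIA tool; the function names are hypothetical. Note that matching only the first column of the lsmod output avoids a false positive from lines such as i2c_core, where nvidia appears only in the "used by" column.

```shell
# Succeeds if the lsmod output read from stdin lists the nvidia module
# itself (first column), not merely a module that nvidia depends on.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Report whether one device node exists, is a character device, and is
# readable and writable by the current user.
check_device_node() {
    node=$1
    if [ ! -e "$node" ]; then
        echo "$node: missing"
        return 1
    elif [ ! -c "$node" ]; then
        echo "$node: not a character device"
        return 1
    elif [ ! -r "$node" ] || [ ! -w "$node" ]; then
        echo "$node: not readable and writable by $(id -un)"
        return 1
    fi
    echo "$node: ok"
}
```

Typical use would be `lsmod | nvidia_module_loaded || echo "nvidia driver not loaded"`, followed by `check_device_node /dev/nvidia0` and `check_device_node /dev/nvidiactl`, run as the user who will run the accelerated program.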
njackson



Joined: 14 May 2010
Posts: 9

Posted: Fri Jul 09, 2010 5:01 pm

Thank you for this information.

As you suggested, the driver was not loading; there was no entry in /proc/modules or /dev. I reinstalled the NVIDIA driver (devdriver_3.1_linux_64_256.35) and

/sbin/lsmod | grep nvid*

now gives:

Quote:
nvidia 11148864 0
i2c_core 56641 3 nvidia,i2c_ec,i2c_i801


And (although I had to add them by hand):

ls -l /dev/nvid*

now gives:

Quote:
crwxrwxrwx 1 root root 195, 0 Jul 9 16:14 /dev/nvidia0
crwxrwxrwx 1 root root 195, 1 Jul 9 16:14 /dev/nvidia1
crwxrwxrwx 1 root root 195, 255 Jul 9 16:15 /dev/nvidiact1


I also reinstalled the CUDA Toolkit (cudatoolkit_3.1_linux_64_rhel5.4) and the SDK samples (gpucomputingsdk_3.1_linux).

The Tesla card now shows up in the KDE "Device Manager" tool.

However, pgaccelinfo still gives:

Quote:
CUDA Driver Version 3010
No accelerators found.
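
The number printed after "CUDA Driver Version" appears to follow the usual CUDA integer encoding of 1000*major + 10*minor, so 2030 from the first post would be CUDA 2.3 and 3010 here would be CUDA 3.1. A small helper (hypothetical name `cuda_version_string`) makes that easier to compare against the installed toolkit version:

```shell
# Convert a CUDA version integer (assumed encoding: 1000*major + 10*minor)
# into a "major.minor" string.
cuda_version_string() {
    v=$1
    major=$(( v / 1000 ))
    minor=$(( v % 1000 / 10 ))
    echo "$major.$minor"
}
```

With the driver at 3.1 and the 3.1 toolkit installed, the two versions agree, which suggests that a "may be mismatched" message from deviceQuery in this situation need not indicate a real version mismatch.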


I wasn't able to build all the programs in the CUDA SDK samples (it gets a few programs in and then the linker aborts on failing to find -lGLU), but I did persuade the deviceQuery program to build okay. It reports:

Quote:
deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

FAILED

Press <Enter> to Quit...
-----------------------------------------------------------


I will try support at NVIDIA, but if you have any further suggestions they would be welcome. Thank you.
njackson



Joined: 14 May 2010
Posts: 9

Posted: Mon Jul 12, 2010 10:11 am    Post subject: Problem solved

I found the solution to the problem halfway through CUDA_Release_Notes_3.1.txt, under "Known Issues". If one isn't running X-Windows (on the accelerator), then one must run a script at startup to make the card available. (As usual, the solution was: RTFM! I had just assumed that the driver install/setup program would actually set up the card into a usable state in the system.)

The commands in this script are essentially the same as the commands dholt suggested, with possibly the only difference being the permissions on the devices.
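
A startup script of this kind might look roughly like the sketch below. It is not a verbatim copy of the script in the release notes; the function names are hypothetical, and the major number 195 and the minor number 255 for the control device are taken from the mknod commands earlier in this thread. The second function only prints the commands so they can be inspected before being run as root.

```shell
# Count NVIDIA GPUs in lspci output read from stdin. Assumes the cards
# are listed as "3D controller" or "VGA compatible controller".
count_nvidia_gpus() {
    grep -i nvidia | grep -c -e '3D controller' -e 'VGA compatible controller' || true
}

# Print (do not run) the mknod commands for N GPUs plus the control
# device. Mode 666 gives every user read/write access.
nvidia_node_commands() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$(( i + 1 ))
    done
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}
```

At boot, as root, this could be used as `/sbin/modprobe nvidia && nvidia_node_commands "$(lspci | count_nvidia_gpus)" | sh`. Creating the nodes with mode 666 rather than the default is the permissions difference that matters for non-root users.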

pgaccelinfo now shows the Tesla card. In fact, it shows three Tesla C1060 cards (Devices 0, 1, and 2). This might be some sort of mirage, but I suppose it's just conceivable that there are three cards in the machine; that would explain why there are three copies of the Tesla install CD, for example. I'll need to investigate further...

Thank you for your assistance.

Regards,
Neil.
All times are GMT - 7 Hours