PGI User Forum


Error running simple CUDA Fortran program
TheMatt



Joined: 06 Jul 2009
Posts: 306
Location: Greenbelt, MD

Posted: Tue Feb 23, 2010 11:14 am    Post subject: Error running simple CUDA Fortran program

At present, I'm looking at converting a program that works with Accelerators to CUDA Fortran. I'm not sure it'll be useful in the long run, but the experience will be well worth it. Unfortunately, I got caught with an ICE:
Code:
PGF90-F-0000-Internal compiler error. unexpected runtime function call

Okay, I'll probably report that to Support if I can't figure it out. But then I wondered about whether my PGI setup might be wonky. So I tried this simple program which has worked before:

Code:
module assign_mod
   use cudafor
contains
   attributes(global) subroutine assign_kernel(A, N)
      implicit none

      integer, value :: N
      integer, device, dimension(N) :: A
      integer, device :: idx

      idx = (blockidx%x - 1) * blockdim%x + threadidx%x

      if (idx <= N) A(idx) = blockidx%x * blockdim%x + threadidx%x
   end subroutine assign_kernel
end module assign_mod

program main
   use cudafor
   use assign_mod

   implicit none

   integer, parameter :: n = 32
   integer, allocatable, dimension(:) :: a_host, b_host
   integer, device, allocatable, dimension(:) :: a_device

   type(dim3) :: dimGrid, dimBlock

   integer, parameter :: blocksize = 4

   integer :: i

   dimBlock = dim3(blocksize,1,1)
   dimGrid = dim3(n/blocksize,1,1)

   allocate(a_host(n))
   allocate(b_host(n))

   allocate(a_device(n))

   forall (i=1:n)
      a_host(i) = 99
   end forall

   a_device = a_host

   call assign_kernel<<<dimGrid, dimBlock>>> (a_device, n)

   b_host = a_device

   write (*,"(I2,1X)",advance="no") b_host
   write (*,*)

end program main

But, when I try to compile and run it:
Code:
> pgfortran trial.cuf
> ./a.out
0: ALLOCATE: 128 bytes requested; status = 35

Can you help me figure out what I've done to wreck my CUDA Fortran setup? (Another example: running the cufinfo.cuf example from 10.2 does nothing on first run and then dumps core when you run it again.)

FYI, I'm running 10.2 and my environment looks like:
Code:
> env | grep -i pgi
MANPATH=/usr/share/man:/usr/local/share/man:/usr/X11R6/man:/opt/pgi/linux86-64/2010/man
LD_LIBRARY_PATH=/home/mathomp4/lib:/opt/pgi/linux86-64/2010/mpi/mpich/lib:/opt/pgi/linux86-64/2010/cuda/lib:/opt/pgi/linux86-64/2010/cuda/open64/lib:/opt/pgi/linux86-64/2010/lib:/opt/pgi/linux86-64/2010/libso:/opt/cuda/lib64::/home/mathomp4/GMAO-Baselibs-3_1_5/Linux/lib:/opt/pgi/linux86-64/2010/mpi/mpich/lib:/opt/cuda/lib64
PGI=/opt/pgi
PATH=.:/home/mathomp4/bin:/home/mathomp4/cvstools:/home/mathomp4/opengrads:/opt/pgi/linux86-64/2010/bin:/opt/pgi/linux86-64/2010/mpi/mpich/bin:/home/dkokron/play/pdt/pdt-3.15/x86_64/bin:/home/dkokron/play/tau/tau-2.19/x86_64/bin:/home/mathomp4/Fortuna/GEOSagcm/src/GMAO_Shared/GEOS_Util/post:/home/mathomp4/Fortuna/GEOSagcm/src/GMAO_Shared/GEOS_Util/plots:/opt/cuda/bin:/home/mathomp4/bin:/opt/pgi/linux86-64/2010/bin:/opt/pgi/linux86-64/2010/mpi/mpich/bin:/opt/pgi/linux86-64/2010/cuda/bin:/opt/cuda/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/mathomp4/bin
LM_LICENSE_FILE=/opt/pgi/license.dat
PGIABBR=/opt/pgi/linux86-64/2010

Thanks,
Matt
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue Feb 23, 2010 12:31 pm

Hi Matt,

Quote:
PGF90-F-0000-Internal compiler error. unexpected runtime function call
Most likely you're using an unsupported device intrinsic like FRACTION or EXPONENT. Can you determine which intrinsic is causing the error? I can then push engineering to get this one bumped up in priority.

Quote:
0: ALLOCATE: 128 bytes requested; status = 35
This is a runtime error meaning that the allocate failed with status 35. I'm assuming this is coming from the device array's allocate, in which case status 35 is coming from a call to cudaMalloc and means "cudaErrorInsufficientDriver: CUDA runtime is newer than driver".

On occasion the driver will stop working correctly, so the first thing I'd do is reboot. If that doesn't work, can you please post your NVIDIA driver version?
Code:
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  195.17  Mon Oct 26 06:19:11 PST 2009
GCC version:  gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)
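
A small shell sketch for pulling just the version number out of that file, in case it helps when comparing machines. The function name `nvidia_driver_version` is made up for this example, and the pattern assumes the `NVRM version:` line has the layout shown in the output above.

```shell
# Hypothetical helper: print the driver version found on the "NVRM version:"
# line of /proc/driver/nvidia/version-style text read from stdin.
# Assumes the version number directly follows "Kernel Module".
nvidia_driver_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Example using the line posted above, so no GPU is needed to try it.
printf '%s\n' 'NVRM version: NVIDIA UNIX x86_64 Kernel Module  195.17  Mon Oct 26 06:19:11 PST 2009' \
    | nvidia_driver_version
```

On a real system the same function would be fed the file directly, e.g. `nvidia_driver_version < /proc/driver/nvidia/version`.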


Thanks,
Mat
TheMatt




Posted: Tue Feb 23, 2010 1:13 pm

mkcolg wrote:
Hi Matt,

Quote:
PGF90-F-0000-Internal compiler error. unexpected runtime function call
Most likely you're using an unsupported device intrinsic like FRACTION or EXPONENT. Can you determine which intrinsic is causing the error? I can then push engineering to get this one bumped up in priority.

I'll take a look. The code is the code you've seen from me before but I've had to transform it from F77 into F90-esque code for my sanity. Entirely possible there is an intrinsic I'm missing. I've converted some FLOAT and DBLE calls to just pure REAL in case those did it, but I'm still getting the error. All that's left are tamer ones: MAX, MIN, LOG10, EXP, SQRT, etc. Could the fact that I'm still using old-style DATA statements to initialize arrays (rather than RESHAPE) do it?

Quote:
Quote:
0: ALLOCATE: 128 bytes requested; status = 35
This is a runtime error meaning that the allocate failed with status 35. I'm assuming this is coming from the device array's allocate, in which case status 35 is coming from a call to cudaMalloc and means "cudaErrorInsufficientDriver: CUDA runtime is newer than driver".

On occasion the driver will stop working correctly, so the first thing I'd do is reboot. If that doesn't work, can you please post your NVIDIA driver version?
Code:
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  195.17  Mon Oct 26 06:19:11 PST 2009
GCC version:  gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)


Ah ha. A reboot or more might be needed:
Code:
> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  185.18.14  Wed May 27 01:23:47 PDT 2009
GCC version:  gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
Is that version of the driver too old for 10.2?

Last edited by TheMatt on Wed Feb 24, 2010 4:53 am; edited 1 time in total
mkcolg




Posted: Tue Feb 23, 2010 3:29 pm

Matt,

All the mathematical intrinsics are supported, unless you are using complex data types which we're still working on. Data statements are fine as well.

Quote:
Is that version of the driver too old for 10.2?
The 185 driver supports cards with compute capability 1.3 (Tesla, GTX280, etc.) but is for CUDA 2.2. With the 10.2 CUDA Fortran, we use CUDA 2.3. I'll need to ask my contacts at NVIDIA to see if this is indeed a conflict. Just in case, you can download the latest NVIDIA drivers at http://www.nvidia.com/Download/Find.aspx?lang=en-us.
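
To make that comparison concrete, here is a rough shell sketch that checks a driver version string against a minimum major series. `driver_major_at_least` is a hypothetical helper, and the threshold is whatever the caller passes in. This thread does not state the exact minimum driver for CUDA 2.3, so the 195 used here is only the series from the driver output posted earlier in the thread, not a documented requirement.

```shell
# Hypothetical helper: succeed if the major part of driver version $1
# (e.g. "185.18.14") is at least the minimum major series $2.
driver_major_at_least() {
    major=${1%%.*}          # keep everything before the first dot
    [ "$major" -ge "$2" ]
}

# The two driver versions that appear in this thread, checked against 195.
# 195 is an illustrative threshold only, not a documented requirement.
for v in 185.18.14 195.17; do
    if driver_major_at_least "$v" 195; then
        echo "$v: at or above series 195"
    else
        echo "$v: older than series 195"
    fi
done
```

Only the major series is compared, which is enough to tell a 185-series driver from a newer one; a finer check would have to compare the minor fields as well.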

- Mat
TheMatt




Posted: Wed Feb 24, 2010 11:19 am

mkcolg wrote:
Matt,

All the mathematical intrinsics are supported, unless you are using complex data types which we're still working on. Data statements are fine as well.

Welp, I'm sunk, then. I can't seem to figure it out. Since it's a big file, I'll send something to Technical Support rather than copy-paste it here.
Quote:
Quote:
Is that version of the driver too old for 10.2?
The 185 driver supports cards with compute capability 1.3 (Tesla, GTX280, etc.) but is for CUDA 2.2. With the 10.2 CUDA Fortran, we use CUDA 2.3. I'll need to ask my contacts at NVIDIA to see if this is indeed a conflict. Just in case, you can download the latest NVIDIA drivers at http://www.nvidia.com/Download/Find.aspx?lang=en-us.

Yep, that did it. Needed CUDA 2.3 and the latest drivers. Thanks.
All times are GMT - 7 Hours