PGI User Forum


device variable in module
RTLEE



Joined: 01 Mar 2006
Posts: 19

Posted: Tue Jan 04, 2011 2:48 pm    Post subject: device variable in module

Is it possible yet (with 11.0) to place allocatable device variables inside a module? I know this has been a pending feature for a while, but I still get an error when I execute the following code on osx86.

Code:

0: ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)


Thanks,
Todd

Code:

module cudamod
   implicit none

   integer, device, allocatable, dimension(:)  :: int_d

   contains
   attributes(global) subroutine foo
      int_d(threadidx%x) = threadidx%x
   end subroutine foo
end module cudamod

program fcuda
   use cudafor
   use cudamod
   implicit none

   integer :: int_h(16)

   int_h = 0

   allocate(int_d(16))
   call foo<<<1,16>>>
   
   int_h = int_d
   
   print *,'int_h = ',int_h
   
   deallocate(int_d)
end program fcuda
mkcolg



Joined: 30 Jun 2004
Posts: 6129
Location: The Portland Group Inc.

Posted: Wed Jan 05, 2011 5:30 pm

Hi RTLEE,

Support for device module allocatables has been available since the 10.6 release, and your code runs fine with the 10.6 through 10.9 versions of the compiler. This appears to be a new bug in 11.0 that only occurs on Mac OS X. The code runs fine on Linux and Windows. I have sent a report to our engineers for further investigation (TPR#17589).

Thanks for the report,
Mat

Code:
% pgf90 -V10.9 test.cuf -Mcuda
% a.out
 int_h =             1            2            3            4            5
            6            7            8            9           10           11
           12           13           14           15           16
% pgf90 -V11.0 test.cuf -Mcuda
% a.out
0: ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)
Jason Kenney



Joined: 14 Jan 2011
Posts: 1

Posted: Tue Jan 25, 2011 5:09 pm

Hi,

I'm new to CUDA Fortran and have a related question/issue. I can compile and run the matmul.CUF example. I then tried adding a module that defines an allocatable device array, without changing the rest of the code to reference it:

Code:
module test_cuda
  use cudafor
  real, device, allocatable, dimension (:) :: testdev
end module test_cuda


The code compiles, but when run it simply exits with no output. If I drop the device attribute, i.e.:

Code:
module test_cuda
  use cudafor
  real, allocatable, dimension (:) :: testdev
end module test_cuda


The code compiles and runs as before (the added module is ignored, as I assumed it would have been earlier). What am I missing?

I'm running a trial license of PGI Workstation 11 under win32.
mkcolg



Joined: 30 Jun 2004
Posts: 6129
Location: The Portland Group Inc.

Posted: Fri Jan 28, 2011 4:27 pm

Hi Jason,

Does your code access testdev or is the only change that you added the four lines of source?

Module device data is currently accessible only to routines in the same module in which it is declared. The problem is that there is no linker for device code, and hence no way to resolve external symbols. We are working on adding this capability by essentially doing the link dynamically at runtime (see: http://www.pgroup.com/lit/articles/insider/v2n3a1.htm).

- Mat
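
The restriction described above suggests keeping the kernel in the same module as the device array it touches. A minimal sketch of that layout, reusing the test_cuda module from the earlier post; the kernel fill_testdev, the program test_host, and the size n are made-up names for illustration, and this has not been checked against any particular compiler release:

```fortran
! Sketch only: the device array and the kernel that uses it live in the
! same module. fill_testdev, test_host and n are hypothetical names.
module test_cuda
   use cudafor
   implicit none

   real, device, allocatable, dimension(:) :: testdev

contains

   attributes(global) subroutine fill_testdev(n)
      integer, value :: n
      integer :: i
      ! Global 1-based index of this thread
      i = (blockidx%x - 1) * blockdim%x + threadidx%x
      if (i <= n) testdev(i) = real(i)
   end subroutine fill_testdev

end module test_cuda

program test_host
   use cudafor
   use test_cuda
   implicit none

   integer, parameter :: n = 16
   real :: host(n)
   integer :: istat

   allocate(testdev(n))              ! allocates device memory
   call fill_testdev<<<1, n>>>(n)

   ! Report a launch failure instead of exiting silently
   istat = cudaGetLastError()
   if (istat /= cudaSuccess) print *, trim(cudaGetErrorString(istat))

   host = testdev                    ! device-to-host copy
   print *, 'host = ', host

   deallocate(testdev)
end program test_host
```

Checking cudaGetLastError after the launch is a design choice aimed at the "exits with no output" symptom: it gives the runtime a chance to report why a kernel did not run.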
Frank Hansche



Joined: 27 May 2014
Posts: 2

Posted: Fri Apr 10, 2015 8:18 am

Hi,

I get the same error with allocatable device variables. I use version 14.10. My executable works fine with an NVIDIA Quadro K2000, but with an NVIDIA K2200 I get:

Code:

0: ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)


I tried the code submitted by RTLEE with the same result: the K2000 is OK, the K2200 gives an error.

Is this still a limitation or a bug, or did I do something wrong?

I used the following compiler flags:

Code:

-tp=p7 -Mcuda -Mvect -Mquad -Mlre -lblas -Mcache_align -Mflushz -acc -Mbackslash -Minfo=all -Mpreprocess -Minline=


Thanks in advance

Frank
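
The Quadro K2000 and K2200 belong to different GPU generations (Kepler, compute capability 3.0, and Maxwell, compute capability 5.0), so one thing that might be worth checking is what the CUDA runtime reports on each machine. Whether that difference explains the error is an assumption; nothing in this thread confirms it. A small sketch that prints the name and compute capability of every visible device (the program name query_cc is made up):

```fortran
! Sketch only: list each visible GPU with its compute capability.
! Assumption (not confirmed in this thread): the failure depends on the
! architecture of the card the executable runs on.
program query_cc
   use cudafor
   implicit none

   type(cudaDeviceProp) :: prop
   integer :: istat, ndev, idev

   istat = cudaGetDeviceCount(ndev)
   if (istat /= cudaSuccess) then
      print *, 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
      stop
   end if

   ! CUDA device numbers start at 0
   do idev = 0, ndev - 1
      istat = cudaGetDeviceProperties(prop, idev)
      if (istat /= cudaSuccess) then
         print *, 'device ', idev, ': ', trim(cudaGetErrorString(istat))
         cycle
      end if
      print *, 'device ', idev, ': ', trim(prop%name), &
               '  compute capability ', prop%major, '.', prop%minor
   end do
end program query_cc
```

Running this on both machines shows whether the two cards report different major versions, which can then be compared with the device targets the compiler was asked to generate code for.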
All times are GMT - 7 Hours
Page 1 of 2

 