PGI User Forum
Dynamic global memory allocation

 
uestc0626



Joined: 14 May 2012
Posts: 6

Posted: Wed Sep 25, 2013 9:51 pm    Post subject: Dynamic global memory allocation

Hi guys, I want to test dynamic global memory allocation. Here is the code:
Code:

module a_test
  integer,device,allocatable :: b(:)
  contains
    attributes(global) subroutine kernel(a,n)
      integer,value   :: n
      integer,device  :: a(n)
      integer  :: i
      !integer,device,allocatable :: b(:)
      if(threadidx%x==1) then
        allocate(b(n))
      endif
      call syncthreads()
      i=threadIdx%x
      b(i)=i
      a(i)=b(i)
    end subroutine
end module

program prog
   use a_test
   use cudafor
   implicit none

   integer,parameter  :: n=128
   integer,device     :: a_d(n)
   integer  :: a(n)

   call kernel<<<1,128>>>(a_d,n)
   a=a_d
 
   write(*,*)a(10:20)
end program


I compiled the code using PVF 13.7, with the "-Mcuda" value set to "-Mcuda=cc3.05,cuda5.0,rdc", but I get the following messages:
Code:

C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s: Assembler messages:
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:4: Warning: value 0xffed truncated to 0xed
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:4: Warning: value 0xffba truncated to 0xba
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:4: Warning: value 0xff98 truncated to 0x98
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:5: Warning: value 0xff80 truncated to 0x80
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:6: Warning: value 0xfff6 truncated to 0xf6
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:13: Warning: value 0xff80 truncated to 0x80
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:15: Warning: value 0xfff3 truncated to 0xf3
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:22: Warning: value 0xff80 truncated to 0x80
C:\Users\Administrator\AppData\Local\Temp\pgnvd3bzu6blDNefEUF.s:23: Warning: value 0xfff1 truncated to 0xf1
pgacclnk: spawnv failed: C:\Users\Administrator\AppData\Local\Temp\pgcudafat2d54bcRJcHcR3D.o

The same code compiles without problems under CentOS using PGI Workstation, but under Windows I can't compile it successfully, and I don't know why. Please help me.
P.S. Is there any tool for debugging CUDA Fortran code?
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Fri Sep 27, 2013 9:56 am

Hi uestc0626,

Apologies for the late response. I was getting input from engineering and wanted to test the released 13.9 (using CUDA 5.5) before responding. Unfortunately, I'm still seeing various errors when building on Windows with either CUDA 5.0 or 5.5. Hence, I added a problem report (TPR#19604) and sent it on to engineering. There is no workaround at this point.

What is your goal with this example? Are you just testing device allocation, or do you really want "b" to be a shared array with a length of "n"? If it's the latter, then the better way to do this is to use a shared automatic array and pass the size of the array, in bytes, as the third argument in the kernel launch.

For example:
Code:
module a_test

  contains

    attributes(global) subroutine kernel(a,n)

      implicit none
      integer,value   :: n
      integer,device  :: a(n)
      integer :: i
      ! shared automatic array; its storage comes from the dynamic
      ! shared memory requested in the kernel launch
      integer,shared :: b(n)

      call syncthreads()

      i=threadIdx%x
      b(i)=i
      a(i)=b(i)
      return
    end subroutine kernel

end module

program prog

  use a_test
  use cudafor
  implicit none

  integer,parameter  :: n=128
  integer,device     :: a_d(n)
  integer            :: a(n)

  ! third launch argument: dynamic shared memory in bytes
  ! (n default 4-byte integers)
  call kernel<<<1,128,n*4>>>(a_d,n)
  a=a_d

  write(*,*)a(10:20)

end program


- Mat
jtull



Joined: 30 Jun 2004
Posts: 445

Posted: Thu Jul 24, 2014 5:15 pm    Post subject: TPR 19604 - UF: CUDA Fortran device allocation fails to comp

TPR 19604 - UF: CUDA Fortran device allocation fails to compile on Windows

This should now work in the 14.6 and current 14.7 releases.

regards,
dave
All times are GMT - 7 Hours