PGI User Forum
Derived Type Array Problem
msgross42



Joined: 19 Feb 2013
Posts: 6

Posted: Wed Mar 06, 2013 11:34 am    Post subject: Derived Type Array Problem

I am trying to allocate an array of a derived type before passing it to my GPU kernel. I have type A defined as a host variable and type B as a device variable of dimension 2. If I use B(:)=A, my code compiles without problems, but when I run it, it crashes before executing the first line of code: a window pops up saying the program has stopped working. If B is just a "scalar" type, it works fine. I have also tried first making a type C that is also of dimension 2, filling it with C(:)=A, and then assigning B=C. This gives me the same result. Is there a way to allocate an array of type B so that all elements are the same as type A? If not, is there a way to pass type B as a local variable into each thread so that each thread can execute using the type without interfering with the other threads?
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Wed Mar 06, 2013 12:14 pm    Post subject: Derived Type Array Problem

Hi msgross42,

I'll need a reproducing example because "B(:)=A" should work. If it's too big to post, please send it to PGI Customer Service (trs@pgroup.com) and ask them to forward it to me.

Thanks,
Mat
msgross42



Joined: 19 Feb 2013
Posts: 6

Posted: Wed Mar 06, 2013 12:47 pm    Post subject: Derived Type Array Problem

Here is a test code that I made which is exhibiting the same problem.

Module Gpumodule
use cudafor
implicit none

real*8,constant :: dt
real*8,constant :: tfinal
real*8,constant :: a
real*8,constant :: b
type InitialConditions
real*8 :: x1(10) = 0.0
real*8 :: x2(10) =0.0
real*8 :: x3(10)= 0.0
end type InitialConditions
type Body
real*8 :: State(3) = 0.0
real*8 :: Statedot(3) = 0.0
type(InitialConditions) IC
end type Body



contains

attributes(global) subroutine simulation(BD,statefinal)
integer i,j,k,e,indx,npts
real*8 sum,nominaltime,nominalstate(3),rkalfa(4),krkbody(3,4),statefinal(10,*),time
type(Body)::BD(:)
indx=(blockidx%x-1) * blockdim%x + threadidx%x
! Define Constants

rkalfa(1) = 1.0; rkalfa(2) = 2.0; rkalfa(3) = 2.0; rkalfa(4) = 1.0

! Initial State Vector

BD(indx)%State(1)=BD(indx)%IC%x1(indx)
BD(indx)%State(2)=BD(indx)%IC%x2(indx)
BD(indx)%State(3)=BD(indx)%IC%x3(indx)
time=0


! Integrate Equations of Motion

npts = nint(tfinal/dt)

do i=1,npts


! Store Nominal State Values

nominaltime = time
nominalstate = BD(indx)%State


! Numerical Integration of Equations of Motion

do j=1,4

! State Values to Evaluate Derivatives

if (j .ne. 1) then
time = nominaltime + dt/rkalfa(j)
do k=1,3
BD(indx)%State(k) = nominalstate(k) + krkbody(k,j-1)/rkalfa(j)
end do
end if

! Compute Derivatives

call deriv(BD(indx))
do k=1,3
krkbody(k,j) = dt*BD(indx)%Statedot(k)
end do

end do

! Step Time

time = nominaltime + dt

! Step States

do j=1,3
sum = 0.0
do k=1,4
sum = sum + rkalfa(k)*krkbody(j,k)
end do
BD(indx)%State(j) = nominalstate(j) + sum/6.0
end do
end do
statefinal(indx,1)=BD(indx)%State(1)
statefinal(indx,2)=BD(indx)%State(2)
statefinal(indx,3)=BD(indx)%State(3)
end subroutine simulation

attributes(device) subroutine deriv(BD)
type(Body) BD
BD%Statedot(1)=BD%State(2)
BD%Statedot(2)=BD%State(3)
BD%Statedot(3)=BD%State(1)*a+BD%State(2)*b
return
end subroutine deriv
end Module GPUmodule

Program Main
use cudafor
use GPUmodule
implicit none
type(Body)::BDH
type(Body),device::BDD(10)
real*8 statefinalh(10,3)
real*8,device :: statefinald(10,3)
integer i,istat

write(*,*) 'start'
a=2
b=5
dt=0.01
tfinal=5

do i=1,10
BDH%IC%x1(i)=i
BDH%IC%x2(i)=i
BDH%IC%x3(i)=i
end do

BDD(:)=BDH

call simulation<<<1,10>>>(BDD,statefinald)
istat=cudaDeviceSynchronize()
statefinalh=statefinald
do i=1,10
write(*,*) statefinalh(i,:)
end do
end Program Main
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Wed Mar 06, 2013 1:30 pm    Post subject: Derived Type Array Problem

Ok, I thought you meant "B" was just a two-element array and "A" was a scalar.

Code:
BDD(:)=BDH
A UDT-to-UDT assignment requires a deep copy. The error occurs because BDD's address needs to be dereferenced, and since it's a device pointer, a segv occurs.


Though in looking at your code, I'm wondering if these UDTs are necessary. It seems you just need to pass in the initial conditions and then use local scalars to hold each thread's state. Using local scalars will help performance since these can be held in the register file rather than being fetched from global memory.

- Mat
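
A minimal sketch of what the local-variable suggestion above might look like, assuming the initial conditions are passed as three plain real(8) device arrays instead of inside the derived type. The module name rk4_sketch, the arguments x1, x2, x3 and n, and the local names s, s0, sdot, k, w and acc are illustrative and do not appear in the posted code. Whether the small local arrays actually end up in registers is up to the compiler.

```fortran
module rk4_sketch
  use cudafor
  implicit none
  ! Constant-memory parameters, as in the posted test code.
  real(8), constant :: dt, tfinal, a, b
contains

  ! x1, x2, x3 and n are hypothetical arguments introduced for this sketch.
  attributes(global) subroutine simulation(x1, x2, x3, statefinal, n)
    integer, value :: n
    real(8) :: x1(n), x2(n), x3(n)   ! initial conditions, one entry per thread
    real(8) :: statefinal(n,3)
    real(8) :: s(3), s0(3), sdot(3), k(3,4), w(4), acc
    integer :: indx, i, j, m, npts

    indx = (blockidx%x-1)*blockdim%x + threadidx%x
    if (indx > n) return

    w(1) = 1.0d0; w(2) = 2.0d0; w(3) = 2.0d0; w(4) = 1.0d0

    ! Thread-private state, read once from global memory.
    s(1) = x1(indx); s(2) = x2(indx); s(3) = x3(indx)

    npts = nint(tfinal/dt)
    do i = 1, npts
      s0 = s
      do j = 1, 4
        ! Intermediate state for stages 2 to 4.
        if (j /= 1) then
          do m = 1, 3
            s(m) = s0(m) + k(m,j-1)/w(j)
          end do
        end if
        ! Same derivatives as deriv() in the posted code.
        sdot(1) = s(2)
        sdot(2) = s(3)
        sdot(3) = s(1)*a + s(2)*b
        do m = 1, 3
          k(m,j) = dt*sdot(m)
        end do
      end do
      ! Weighted sum of the four stages.
      do m = 1, 3
        acc = 0.0d0
        do j = 1, 4
          acc = acc + w(j)*k(m,j)
        end do
        s(m) = s0(m) + acc/6.0d0
      end do
    end do

    ! Single write back to global memory.
    do m = 1, 3
      statefinal(indx,m) = s(m)
    end do
  end subroutine simulation
end module rk4_sketch

program main
  use cudafor
  use rk4_sketch
  implicit none
  integer, parameter :: n = 10
  real(8) :: xh(n), sfh(n,3)
  real(8), device :: x1d(n), x2d(n), x3d(n), sfd(n,3)
  integer :: i, istat

  a = 2.0d0; b = 5.0d0; dt = 0.01d0; tfinal = 5.0d0
  do i = 1, n
    xh(i) = i
  end do
  ! Arrays of intrinsic type copy to the device with plain assignment.
  x1d = xh; x2d = xh; x3d = xh

  call simulation<<<1,n>>>(x1d, x2d, x3d, sfd, n)
  istat = cudaDeviceSynchronize()
  sfh = sfd
  do i = 1, n
    write(*,*) sfh(i,:)
  end do
end program main
```

Because only arrays of intrinsic type cross the host/device boundary in this version, the derived-type assignment that triggers the crash is not needed at all. The trade-off is that the kernel no longer mirrors the structure of the original host code.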
msgross42



Joined: 19 Feb 2013
Posts: 6

Posted: Wed Mar 06, 2013 1:38 pm    Post subject: Derived Type Array Problem

The reason it is set up like this is to try and mimic how my actual code is set up. I am trying to adapt an existing code to be used on a GPU and I am trying to limit the number of significant modifications. Is there a way to define another derived type within the GPU kernel that exists only in local memory as opposed to continually referencing the type that is passed into global memory in the function call?
All times are GMT - 7 Hours
Page 1 of 2