Tuan
Joined: 11 Jun 2009 Posts: 226
|
Posted: Sun May 16, 2010 4:41 pm Post subject: APM PGI 10.5 - !$acc region |
|
|
I have a question.
| Code: |
!$acc region copyin(Cs), copyout(Ds)
  Ds = Cs
  DO i = 1, n
    DO j = 1, n
      ...
    ENDDO
  ENDDO
!$acc end region
|
My question is whether the statement Ds=Cs is performed on the accelerator. If not, should I do something like this instead?
| Code: |
!$acc region copyin(Cs), copyout(Ds)
  DO i = 1, n
    Ds(i,:) = Cs(i,:)
    DO j = 1, n
      ...
    ENDDO
  ENDDO
!$acc end region
|
Thanks,
Tuan |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Mon May 17, 2010 3:32 pm Post subject: |
|
|
Hi Tuan,
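Since the array assignment "Ds=Cs" is treated as an implied do loop, it will be accelerated. The -Minfo output below reports a separate accelerator kernel for it (see the message marked with "<<<<<").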
| Code: |
% cat test.f90
program test
  real, dimension(1024,1024) :: Ds, Cs
  integer :: i,j,n
  n = 1024
  Cs = 0.231
!$acc region copyin(Cs), copyout(Ds)
  Ds = Cs
  DO i = 1, n
    DO j = 1, n
      Ds(i,j) = Ds(i,j) * (i+j)
    ENDDO
  ENDDO
!$acc end region
  print *, Cs(1,1), Ds(1,1), Cs(1024,1024), Ds(1024,1024)
end program test
% pgf90 -ta=nvidia -Minfo=accel test.f90 -V10.5
test:
      9, Generating copyin(cs(:,:))
         Generating copyout(ds(:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
     10, Loop is parallelizable <<<<< Implied Do loop for Ds=Cs
         Accelerator kernel generated
         10, !$acc do parallel, vector(16)
             CC 1.0 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
             CC 1.3 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
     11, Loop is parallelizable
     12, Loop is parallelizable
         Accelerator kernel generated
         11, !$acc do parallel, vector(16)
         12, !$acc do parallel, vector(16)
             CC 1.0 : 8 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
             CC 1.3 : 8 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
|
Hope this helps,
Mat |
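Editor's sketch: the reply above rests on the array assignment being shorthand for a loop nest. The small host-only program below is not from the thread and uses no accelerator directives. It assumes a smaller array size and uses hypothetical names (Ds_array, Ds_loop, Ds_rows) to compare the whole-array assignment, the equivalent explicit loop nest, and the row-slice variant from the original question.

```fortran
program array_assign_equiv
  implicit none
  ! n and the Ds_* names are illustrative assumptions, not from the thread.
  integer, parameter :: n = 64
  real, dimension(n,n) :: Cs, Ds_array, Ds_loop, Ds_rows
  integer :: i, j

  Cs = 0.231

  ! Form 1: whole-array assignment, as written inside the region.
  Ds_array = Cs

  ! Form 2: the explicit loop nest that the assignment stands for.
  do j = 1, n
    do i = 1, n
      Ds_loop(i,j) = Cs(i,j)
    end do
  end do

  ! Form 3: row-slice copy inside the outer loop (the alternative asked about).
  do i = 1, n
    Ds_rows(i,:) = Cs(i,:)
  end do

  if (all(Ds_array == Ds_loop) .and. all(Ds_array == Ds_rows)) then
    print *, 'all three forms agree'
  else
    print *, 'mismatch'
  end if
end program array_assign_equiv
```

All three forms copy the same values, so the choice between them is about how the compiler schedules the work, not about the result. Whether a given form is turned into a kernel is best confirmed from the -Minfo=accel messages, as in the reply above.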
|
Tuan
Joined: 11 Jun 2009 Posts: 226
|
Posted: Tue May 18, 2010 8:41 am Post subject: |
|
|
Thanks, Mat.
I forgot to check the compiler's output.
Tuan |
|
TheMatt
Joined: 06 Jul 2009 Posts: 263 Location: Greenbelt, MD
|
Posted: Tue May 18, 2010 8:46 am Post subject: |
|
|
Mat,
I know you knew this was coming: how can I get those nice cubin-like status messages out of pgfortran? |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Wed May 19, 2010 9:17 am Post subject: |
|
|
Hi Matt,
| Quote: |
CC 1.0 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy
CC 1.3 : 6 registers; 24 shared, 32 constant, 0 local memory bytes; 100 occupancy |
These are new in 10.5. We took your advice and added the output of "--ptxas-options=-v" to the "-Minfo=accel" messages.
Sorry, I should have updated your post to let you know.
Thanks,
Mat |
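Editor's sketch: a rough way to see where a 100 occupancy figure can come from, using the second kernel's numbers (8 registers, 24 bytes of shared memory). It assumes that vector(16) on both loops gives 16 x 16 thread blocks. It also assumes the per-multiprocessor limits NVIDIA publishes for these compute capabilities, which the thread does not state. The estimate ignores register and shared-memory allocation granularity, so treat it as an approximation, not as the compiler's or ptxas's actual calculation.

```fortran
program occupancy_estimate
  implicit none
  ! Assumption: vector(16) on both loops gives 16 x 16 threads per block.
  integer, parameter :: threads_per_block = 16 * 16
  ! Taken from the -Minfo messages for the second kernel.
  integer, parameter :: regs_per_thread = 8
  integer, parameter :: shared_per_block = 24

  ! Assumed hardware limits per multiprocessor (not stated in the thread):
  ! max resident threads and 32-bit registers for each compute capability.
  call report('CC 1.0', 768, 8192)
  call report('CC 1.3', 1024, 16384)

contains

  subroutine report(label, max_threads, max_regs)
    character(len=*), intent(in) :: label
    integer, intent(in) :: max_threads, max_regs
    ! Assumed limits shared by both capabilities: 8 blocks, 16 KB shared memory.
    integer, parameter :: max_blocks = 8, max_shared = 16384
    integer :: blocks, pct

    ! Resident blocks are capped by the tightest of the four limits.
    blocks = min(max_blocks, &
                 max_threads / threads_per_block, &
                 max_regs / (regs_per_thread * threads_per_block), &
                 max_shared / shared_per_block)
    pct = (100 * blocks * threads_per_block) / max_threads
    print '(a, a, i0, a, i0, a)', label, ': ', blocks, &
          ' blocks per multiprocessor, ', pct, '% occupancy'
  end subroutine report
end program occupancy_estimate
```

Under these assumptions the thread limit is the binding one in both cases (3 blocks on CC 1.0, 4 blocks on CC 1.3), which is consistent with the 100 occupancy reported in the quoted messages.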
|