PGI User Forum

Accelerator strange behaviour
franzisko



Joined: 11 Jan 2011
Posts: 25

Posted: Tue Jul 31, 2012 2:04 am    Post subject: Accelerator strange behaviour

Hello,
I have a problem accelerating a code. I managed to reproduce the strange behaviour in a small test program:

Code:
program strange_behaviour
implicit none
integer, parameter :: nx=4,ny=4,nz=2
integer :: i,j,k
real, dimension(nx,ny,nz) :: a,b

a=10
call random_number(b)

!$acc kernels
do j=1,ny
do i=1,nx
!$acc do seq
   do k=1,nz
      a(i,j,k) = 3.
   enddo

!  IT DOES NOT WORK ON GPU
   b(i,j,1) = a(i,j,1)

!  WORKAROUND
!!$acc do seq
!   do k=1,1
!   b(i,j,k) = a(i,j,k)
!   enddo

enddo
enddo
!$acc end kernels

print*,'b: ',b
end program strange_behaviour


I can fix the problem with the workaround shown in the code, but I would very much like to avoid it if possible. Is there something I am misunderstanding? I am using PGI 12.6.
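Because the symptom is a wrong first plane of b, a self-checking variant of the reproducer may make it easier to report a clear pass or fail on a given compiler and driver combination. The following is only a sketch: the program name, the `expected` constant, and the `b_before` host copy are illustrative additions and are not part of the original post. The kernel and its directives are unchanged.

```fortran
program check_strange_behaviour
  implicit none
  integer, parameter :: nx = 4, ny = 4, nz = 2
  real, parameter :: expected = 3.0      ! value the kernel should store
  integer :: i, j, k
  real, dimension(nx, ny, nz) :: a, b, b_before

  a = 10
  call random_number(b)
  b_before = b   ! host copy, used to verify that the k=2 plane is untouched

  ! Same kernel as in the original post (directives unchanged).
  !$acc kernels
  do j = 1, ny
    do i = 1, nx
      !$acc do seq
      do k = 1, nz
        a(i, j, k) = 3.
      enddo
      b(i, j, 1) = a(i, j, 1)
    enddo
  enddo
  !$acc end kernels

  ! The first plane must equal 3.0; the second plane must keep its random values.
  if (all(b(:, :, 1) == expected) .and. all(b(:, :, 2) == b_before(:, :, 2))) then
    print *, 'PASS'
  else
    print *, 'FAIL: b(:,:,1) = ', b(:, :, 1)
    stop 1
  end if
end program check_strange_behaviour
```

Built without -acc, the directives are treated as comments and the program should print PASS, which gives a host baseline to compare against the accelerated build.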

Thanks for any help,
Francesco
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Tue Jul 31, 2012 4:51 pm

Hi Francesco,

Can you please give more details on the problem you are seeing? When I run your code with and without -acc, I get the same answers.

Code:
% pgf90 strange.f90 -Minfo -V12.6
% a.out
 b:     3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
   0.6280164       0.6701866       0.6281718       0.5310344     
   0.5005002       0.4253310       0.3070166       0.2169546     
   0.6901000       0.8211479       0.8735071       0.9649668     
   0.8245004       0.4523637       0.2586277       0.3373762   
% pgf90 strange.f90 -Minfo -V12.6 -acc
strange_behaviour:
      7, Memory set idiom, array assignment replaced by call to pgf90_mset4
     10, Generating copyout(b(:,:,:1))
         Generating copyout(a(:,:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
     11, Loop is parallelizable
     12, Loop is parallelizable
         Accelerator kernel generated
         11, !$acc loop gang ! blockidx%y
         12, !$acc loop gang, vector(32) ! blockidx%x threadidx%x
             CC 1.0 : 10 registers; 32 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 13 registers; 0 shared, 48 constant, 0 local memory bytes
     14, Loop is parallelizable
% a.out
 b:     3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
   0.6280164       0.6701866       0.6281718       0.5310344     
   0.5005002       0.4253310       0.3070166       0.2169546     
   0.6901000       0.8211479       0.8735071       0.9649668     
   0.8245004       0.4523637       0.2586277       0.3373762   


- Mat
franzisko




Posted: Wed Aug 01, 2012 3:55 am

Hi Mat,

I ran it exactly as you did, but the results are not correct. My device is a Tesla S2050, and unfortunately I cannot easily test on any other device.

Code:

% pgf90 strange.f90 -Minfo -V12.6
ella008:~/w/COSMO/CONVSTN_LUGLIO/TEST/PGI_BUG>./a.out
 b:     3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
    3.000000        3.000000        3.000000        3.000000     
   0.6280164       0.6701866       0.6281718       0.5310344     
   0.5005002       0.4253310       0.3070166       0.2169546     
   0.6901000       0.8211479       0.8735071       0.9649668     
   0.8245004       0.4523637       0.2586277       0.3373762   
% pgf90 strange.f90 -Minfo -V12.6 -acc
strange_behaviour:
      7, Memory set idiom, array assignment replaced by call to pgf90_mset4
     10, Generating copyout(b(:,:,:1))
         Generating copyout(a(:,:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
     11, Loop is parallelizable
     12, Loop is parallelizable
         Accelerator kernel generated
         11, !$acc loop gang ! blockidx%y
         12, !$acc loop gang, vector(32) ! blockidx%x threadidx%x
             CC 1.0 : 10 registers; 32 shared, 4 constant, 0 local memory bytes
             CC 2.0 : 13 registers; 0 shared, 48 constant, 0 local memory bytes
     14, Loop is parallelizable

%./a.out
 b:   -1.9983972E+18  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18
  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18
  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18
  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18  -1.9983972E+18
   0.6280164       0.6701866       0.6281718       0.5310344     
   0.5005002       0.4253310       0.3070166       0.2169546     
   0.6901000       0.8211479       0.8735071       0.9649668     
   0.8245004       0.4523637       0.2586277       0.3373762   


The code works correctly with PGI 12.5.

Thanks,
Francesco
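The -Minfo output shows that the compiler chose the data movement on its own (copyout(b(:,:,:1)) and copyout(a(:,:,:))). As a diagnostic sketch only, and not a confirmed fix, one could state the data clauses explicitly to see whether the wrong values are tied to the generated copyout. The program name and the `copy(a, b)` clause are assumptions added for illustration; whether this changes the result with PGI 12.6 on the Tesla S2050 is not established in this thread.

```fortran
program explicit_data_variant
  implicit none
  integer, parameter :: nx = 4, ny = 4, nz = 2
  integer :: i, j, k
  real, dimension(nx, ny, nz) :: a, b

  a = 10
  call random_number(b)

  ! Assumption: explicit copy clauses replace the compiler-chosen copyout,
  ! so both arrays are copied to the device and back in full.
  !$acc kernels copy(a, b)
  do j = 1, ny
    do i = 1, nx
      !$acc do seq
      do k = 1, nz
        a(i, j, k) = 3.
      enddo
      b(i, j, 1) = a(i, j, 1)
    enddo
  enddo
  !$acc end kernels

  ! Report whether the first plane of b holds the value stored by the kernel.
  print *, 'b(:,:,1): ', b(:, :, 1)
  if (all(b(:, :, 1) == 3.0)) then
    print *, 'first plane OK'
  else
    print *, 'first plane WRONG'
  end if
end program explicit_data_variant
```

If this variant behaves differently from the original on the same setup, that difference would be a useful detail to include in a bug report.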
mkcolg




Posted: Wed Aug 01, 2012 2:27 pm

Hi Francesco,

What CUDA driver version do you have (see the output from pgaccelinfo)? In 12.6 we switched to using CUDA 4.2 by default, so an older driver may cause problems.

You can also try switching back to using CUDA 4.0 by compiling with "-ta=nvidia,cuda4.0".

- Mat
franzisko




Posted: Thu Aug 02, 2012 7:35 am

Hi Mat,

Thanks for your attention.

Our CUDA driver is 4.1. However, when I compile with cuda4.0 or cuda4.1, the error is still the same. Everything always works fine with PGI V12.5.

Something (maybe similar) happens in the large code I am working on at the moment.

Francesco
All times are GMT - 7 Hours
Page 1 of 2

 