PGI User Forum

Loop not vectorized: mixed data types
Zhenya



Joined: 15 Jan 2011
Posts: 4

Posted: Sat Jan 15, 2011 3:41 pm    Post subject: Loop not vectorized: mixed data types

I'm trying to optimize a Fortran 90 code and squeeze the last drops of performance out of it :-).

When I compile the code using
pgf90 -fastsse -Minline=levels:2 -Minfo=all ...
the log contains this message:
Quote:
924, Loop not vectorized: mixed data types
926, greenfun inlined, size=3, file mf__jan15_bilayer.f (2318)
2323, greenfun_2 inlined, size=23, file mf__jan15_bilayer.f (2332)


The line numbers point to the following part of the code (I restored the line numbers by hand):
Code:
924      do j=1,pm ; vova = nm_clmn(j)
925         sv = ksite(vova); tv = ktau(vova)
926          m_v2(j,2) = GREENFUN(sv,tv,site,tnew)
927      enddo

Here nm_clmn(:) and ksite(:) are (rather large) integer arrays, and ktau(:) and m_v2(:,1:2) are real*8 arrays. GREENFUN is a routine that gets inlined; as far as I understand, it's basically a spline interpolation of a pre-tabulated array.

The question, then, is: what exactly prevents the compiler from vectorizing the loop, and is there a way to fix it?

Any suggestions would be greatly appreciated.

Zhenya
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Mon Jan 17, 2011 9:14 am    Post subject:

Hi Zhenya,
Quote:

924, Loop not vectorized: mixed data types
This means that the data types on the left- and right-hand sides are different, which prevents vectorization.

What data type does GREENFUN return? I'm assuming real*4, in which case either it needs to be changed to real*8 or m_v2 needs to be real*4.

Hope this helps,
Mat
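
A minimal sketch of the kind of mismatch described above, assuming a real*4 right-hand side assigned to a real*8 array. The program and array names (mixed_kinds, a4, a8, b8) are hypothetical, and whether pgf90 reports "mixed data types" for this exact loop has not been verified; the point is only to show what "different types on the two sides" looks like next to a uniform real*8 version.

```fortran
program mixed_kinds
  implicit none
  integer, parameter :: n = 8
  real*4  :: a4(n)           ! single-precision source (hypothetical)
  real*8  :: a8(n), b8(n)    ! double-precision source and destination
  integer :: j

  do j = 1, n
     a4(j) = real(j)         ! real() returns default (single) precision
     a8(j) = dble(j)
  enddo

  ! Mixed: the real*4 expression is converted to real*8 on every assignment
  do j = 1, n
     b8(j) = 2.0 * a4(j)
  enddo
  print *, 'mixed   sum =', sum(b8)

  ! Uniform: both sides of the assignment are real*8
  do j = 1, n
     b8(j) = 2.d0 * a8(j)
  enddo
  print *, 'uniform sum =', sum(b8)
end program mixed_kinds
```

Compiling both loops with -Minfo=all and comparing the messages is one way to see whether the conversion matters for a given compiler version.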
Zhenya



Joined: 15 Jan 2011
Posts: 4

Posted: Mon Jan 17, 2011 11:36 am    Post subject:

Hi Mat,

As a rule of thumb I try to avoid mixing real*4 and real*8 by just using real*8 always.

I double-checked the code once again, and no: there are only integers (no kind specified) and real*8 variables. All the assignments in this code block are either integer-to-integer or real*8-to-real*8.

That's a little puzzling, at least for me.

Zhenya
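
One way to double-check declared types mechanically is the standard kind() inquiry function. The sketch below is only an illustration: the program name show_kinds and the function interp are hypothetical stand-ins, not part of the code in this thread, and the kind numbers printed depend on the compiler and on any promotion flags in use.

```fortran
program show_kinds
  implicit none
  integer :: i          ! default integer, no kind specified
  real*8  :: x          ! explicit 8-byte real, as used in the thread
  real    :: y          ! default real, for comparison

  i = 0; x = 0.d0; y = 0.0
  print *, 'kind of default integer :', kind(i)
  print *, 'kind of real*8 variable :', kind(x)
  print *, 'kind of default real    :', kind(y)
  print *, 'kind of literal 1.0     :', kind(1.0)
  print *, 'kind of literal 1.d0    :', kind(1.d0)
  print *, 'kind of interp() result :', kind(interp(x))

contains

  ! Hypothetical stand-in for an inlined routine such as GREENFUN
  real*8 function interp(t)
    implicit none
    real*8, intent(in) :: t
    interp = 2.d0*t
  end function interp

end program show_kinds
```

If the last line reports the same kind as the real*8 variable, the function result and the array it is assigned to agree in type.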
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Tue Jan 18, 2011 3:46 pm    Post subject:

Hi Zhenya,

I'm not sure, then. Can you post a reproducing example and/or the source of GREENFUN?

Thanks,
Mat
Zhenya



Joined: 15 Jan 2011
Posts: 4

Posted: Fri Jan 21, 2011 5:52 am    Post subject:

Hi Mat,

Here's the code that shows this glitch. It is a copy-paste of the relevant parts of the real code, and this snippet is *not* supposed to run: the allocatable arrays are not allocated, variables are declared but not initialized, etc.

I'm compiling it with
Code:

pgf90 -Mfree -fastsse -Minfo=all -Minline=level:2 loopnotectorized.f -lacml


and the compilation log says:

Code:

MAIN:
     31, Loop not vectorized: mixed data types
     33, greenfun inlined, size=3, file loopnotectorized.f (42)
          47, greenfun_2 inlined, size=23, file loopnotectorized.f (56)
greenfun:
     47, greenfun_2 inlined, size=23, file loopnotectorized.f (56)


The loop in question is marked with a comment !@#$%

Code:

      implicit none
! globals
   real*8, allocatable :: m_v2(:,:)
        integer             :: pm,lda      ! actual size & leading dimension
      real*8, allocatable :: GR_DAT_2(:,:,:), GRD_DAT_2(:,:,:)   ! cf ine TABU_2
   real*8 :: beta       ! inverse temperature
   integer :: ntab                   ! Actual Number of sites per dimension for tabulation
   integer :: mtau
   real*8 :: bmt, bmt1     ! a shorthand for beta/mtau, and its inverse

! from the lattice module
     integer  :: Nsite, Ncell          ! # of sites, # of unit cells
   integer, allocatable   :: ksite(:)      ! ksite(name) => site of a kink 'name'
   real*8, allocatable    :: ktau(:)       ! ktau(name) => tau of a kink 'name'
   integer, allocatable   :: nm_row(:),nm_clmn(:)  ! nm_row(row) => name of the kink associated with the row

! from add_2_same
   integer :: site,j,nk,vova,sv
   real*8  :: tnew, tv, tnew2

!----------------
   lda=128;
   allocate(m_v2(lda,2))

     
 !@#$% -----------
      do j=1,pm ; vova = nm_clmn(j)
         sv = ksite(vova); tv = ktau(vova)
         m_v2(j,1) = GREENFUN(sv,tv,site,tnew2)
      enddo


      contains


!----------------------------------------------------
! Green Function, selector
!----------------------------------------------------
      real*8 function GREENFUN(site1,tau1,site2,tau2)
      implicit none
      integer, intent(in) :: site1, site2
      real*8, intent(in)  :: tau1, tau2

      GREENFUN = GREENFUN_2(site1,tau1,site2,tau2)
!      GREENFUN = GREENFUN_1(site1,tau1,site2,tau2)     

      end function GREENFUN


!----------------------------------------------
!---  Green Function, spline interpolation of GR_DAT_2
!----------------------------------------------
      real*8 function GREENFUN_2(site1,tau1,site2,tau2)
      implicit none
      integer :: site1,site2,j, sgn
      real*8 :: tau, tau1, tau2, dt, gre

      integer :: nx, ny, nz, nta  !, ntb
      real*8 :: tta,ttb,ga,gb,c, gra,grb   !,p

! prepare \tau
      tau=tau1-tau2
      dt=tau; sgn=1

   if(tau < 1.d-14)then; dt=beta+tau; sgn=-1; endif
! Explanation: G(t=0) must be understood as G(t-> -0) = -G(t=\beta)
! A long way to accomplish this is below, commented out. A short way is above :).
!----------------------------------------

!----------------------------------- spline
   nta=dt*bmt1 !*p

      tta=dt-nta*bmt
   ttb=tta - bmt     !dt-ntb*(beta/mtau)
!cccccccccccccccccccccccccccccccccccccc
     
   ga=GR_DAT_2(nta,site1,site2)
   gb=GR_DAT_2(nta+1,site1,site2)

   gra=GRD_DAT_2(nta,site1,site2)
   grb=GRD_DAT_2(nta+1,site1,site2)

      c=(ga-gb)*bmt1

      gre=(c+gra)*ttb + (c+grb)*tta
      gre=gre*tta*ttb*bmt1 + gb*tta-ga*ttb
      gre=gre*bmt1


   GREENFUN_2 = gre*sgn


      end function GREENFUN_2




!-------------------------------------
!     Tabulates Green function and its time derivative at positive tau : ALL-NUMERICAL,
!        taken from ../disord/fermi-hubbard-disord.f
!-------------------------------------
      subroutine TABU_2
      implicit none
      real*8, allocatable :: ham(:,:)
      integer :: site,site1,j
   real*8 :: factor, ww,ttt,term, gamma, expet(0:mtau)
   integer :: nt

   ! lapack stuff
   character*1 :: jobz,uplo
   integer     :: ldh, lwork,info
   real*8, allocatable  :: work(:), eps(:)

   integer :: site2, i_x1(3), i_x2(3), n1,n2

   print*,' TABU_2'


      if(allocated(GR_DAT_2)) deallocate(GR_DAT_2)
      if(allocated(GRD_DAT_2)) deallocate(GRD_DAT_2)

   allocate( GR_DAT_2(0:mtau+1,1:Nsite,1:Nsite), GRD_DAT_2(0:mtau+1,1:Nsite,1:Nsite) )


! build the hamiltonian
   allocate(ham(1:Nsite,1:Nsite)) ;
   ham=0.d0
!------------------------------- commented out for the loopnotectorized ONLY
!   do site=1,Nsite
!      do j=1,coord_nbr(site); site1=neighb(j,site);
!         ham(site,site1)=ham(site,site1)-hop_int(j,site) 
!      enddo
!         if(site_layer(site)==1)then; ham(site,site) = -Vlayer
!         else;                        ham(site,site) =  Vlayer
!         endif
!   enddo;   

!  compute eigenvalues; for LAPACK parameters and arguments, see
!  http://www.netlib.org/lapack/double/dsyev.f
!  SUBROUTINE DSYEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO )

   jobz='V'  ! compute eigenvectors
   uplo='U'  ! use upper diag of ham(:,:) --- doesn't matter really
   ldh=Nsite
   lwork=12*Nsite
   allocate( work(lwork), eps(Nsite) )

! query the optimal workspace size
   call dsyev(jobz,uplo,Nsite,ham,ldh,eps,work,-1,info)
   lwork=work(1)
   deallocate(work); allocate(work(lwork))

! diagonalize
   call dsyev(jobz,uplo,Nsite,ham,ldh,eps,work,lwork,info)

   if(info/=0)then; print*,'*** dsyev returns info = ', info
          print*,'*** check the TABULATE routine'
          call mystop
   endif

!------------- have the spectrum, proceed to the GFs
   GR_DAT_2=0.d0; GRD_DAT_2=0.d0   

   do j=1,Nsite

     gamma=-eps(j)*beta
     gamma=exp(gamma)+1.d0
     gamma=-1.d0/gamma

     ww = exp(-eps(j)*bmt) ! bmt=beta/mtau
     do nt=0,mtau; expet(nt)=ww**nt
     enddo

          do site=1,Nsite ; do site1=1,Nsite
       factor = ham(site,j)*ham(site1,j)
            do nt=0,mtau
               term = factor*expet(nt)*gamma    !/Nsite
               GR_DAT_2(nt,site,site1) = GR_DAT_2(nt,site,site1) + term
               GRD_DAT_2(nt,site,site1) = GRD_DAT_2(nt,site,site1) -eps(j)*term
            enddo

          enddo; enddo ! site, site1

   enddo   ! j: eigenvalues
!------------------------


! fill in fictitious nt=mtau+1, see GREENFUN for explanation
   GR_DAT_2(mtau+1,:,:)=0.d0; GRD_DAT_2(mtau+1,:,:)=0.d0

      end subroutine TABU_2
!++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

   subroutine mystop
!!! free the memory etc
   stop
   end subroutine mystop

      end



Do you spot anything suspicious?

Thanks,
Zhenya
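
One thing that can be read off the listing: inside GREENFUN_2, which is inlined into the marked loop, integers and real*8 values do meet. The statement nta=dt*bmt1 truncates a real*8 product to an integer, tta=dt-nta*bmt multiplies an integer by a real*8, and GREENFUN_2 = gre*sgn multiplies a real*8 by the integer sgn. Whether these conversions are what the compiler's message refers to is an assumption, not something confirmed in the posts above. The sketch below isolates that arithmetic with illustrative values; the program name spline_types and the variable rsgn are hypothetical.

```fortran
program spline_types
  implicit none
  real*8  :: dt, bmt, bmt1, tta, gre
  integer :: nta, sgn
  real*8  :: rsgn              ! hypothetical real*8 replacement for sgn

  bmt  = 0.25d0                ! illustrative values only
  bmt1 = 1.d0/bmt
  dt   = 0.6d0
  gre  = 1.5d0
  sgn  = -1
  rsgn = -1.d0

  ! As in the listing: implicit real*8 -> integer truncation,
  ! followed by integer * real*8 products
  nta = dt*bmt1
  tta = dt - nta*bmt
  print *, 'implicit:', nta, tta, gre*sgn

  ! Same arithmetic with the conversions spelled out and a real*8 sign
  nta = int(dt*bmt1)
  tta = dt - dble(nta)*bmt
  print *, 'explicit:', nta, tta, gre*rsgn
end program spline_types
```

Spelling out int() and dble() does not remove the conversions, since the table index nta has to be an integer; it only makes them visible. Keeping the sign as a real*8 does remove one integer operand from the final product.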
All times are GMT - 7 Hours
Page 1 of 2