PGI User Forum

Is there a way to vectorize this routine?
tangoman



Joined: 23 Aug 2007
Posts: 6

Posted: Fri Oct 05, 2007 12:15 am

MattD,
Thanks. I used PGI 6.1 on SuSE 9.2 x86-64 Linux.
% pgf90 -V

pgf90 6.1-2 64-bit target on x86-64 Linux
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2005, STMicroelectronics, Inc. All Rights Reserved.

The following test code failed to vectorize.
Code:

      PROGRAM TEST
      IMPLICIT NONE

      integer nx,ny,nz
      parameter(nx=128)
      parameter(ny=128)
      parameter(nz=128)

      complex a,b,sum
      dimension a(nz,ny,nx)
      dimension b(nz,ny,nx)
      integer i,j,k

      do i=1,nx
         do j=1,ny
            do k=1,nz
               sum = sum+a(k,j,i)*b(k,j,i)
            enddo
         enddo
      enddo

      return
      end


% pgf90 -O2 -Mvect=sse -Minfo test.f
test:
16, Unrolled inner loop 8 times
Generated 1 prefetch instructions for this loop

Any suggestions? Thanks!
MattD



Joined: 20 Aug 2007
Posts: 6
Location: Greenville, TX

Posted: Tue Oct 09, 2007 11:11 am

tangoman,

I tried the code you posted, and I get the same optimization report that you listed (no vectorization).

However, if I change COMPLEX to REAL (for a, b, and sum), it does vectorize.
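For reference, the REAL variant amounts to a one-line change in the declarations. The listing below is only a sketch of that change. The initialization of a, b, and sum, the final print, and the self-check are additions that are not in the posted test. They are there so that the result is defined and is actually used. The values 1.0 and 0.5 are arbitrary choices that make the expected total exact in single precision.

```fortran
      program test_real
      implicit none

      integer nx, ny, nz
      parameter (nx=128, ny=128, nz=128)

      ! REAL instead of COMPLEX: the only change described above
      real a, b, sum
      dimension a(nz,ny,nx)
      dimension b(nz,ny,nx)
      integer i, j, k

      ! Added for this sketch: define the inputs and the accumulator
      a = 1.0
      b = 0.5
      sum = 0.0

      do i = 1, nx
         do j = 1, ny
            do k = 1, nz
               sum = sum + a(k,j,i)*b(k,j,i)
            enddo
         enddo
      enddo

      ! Added for this sketch: use the result and check it
      ! (128**3 products of 0.5 each add up to 1048576)
      print *, 'sum =', sum
      if (abs(sum - 1048576.0) .gt. 0.5) stop 1
      end
```

The listing avoids continuation lines and keeps comments out of column 6, so it should compile either as fixed form (test.f) or free form.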

So it seems that the compiler does not auto-vectorize complex math in this example. I can't think offhand why it wouldn't, though.

You could vectorize it yourself if the performance was really critical. It shouldn't be that tough for this example (as opposed to the problem you presented in the first post in this thread).
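One possible way to restructure it by hand, sketched here under assumptions: keep the real and imaginary parts in separate REAL arrays and accumulate the two parts of the product separately, so the inner loop contains only REAL arithmetic. The array names (ar, ai, br, bi), the 1-D layout, and the test values are hypothetical. Whether PGI 6.1 actually vectorizes this loop has not been checked here, and in real code the cost of splitting the arrays would have to be weighed against any gain.

```fortran
      program split_complex
      implicit none

      integer n, k
      parameter (n=1024)

      complex a(n), b(n), ref
      real ar(n), ai(n), br(n), bi(n)
      real sr, si

      ! Arbitrary test values; each product is (5.0, 5.0)
      a = cmplx(1.0, 2.0)
      b = cmplx(3.0, -1.0)

      ! Reference: the complex accumulation from the posted loop
      ref = (0.0, 0.0)
      do k = 1, n
         ref = ref + a(k)*b(k)
      enddo

      ! Split each operand into separate real and imaginary arrays
      ar = real(a)
      ai = aimag(a)
      br = real(b)
      bi = aimag(b)

      ! (ar + i*ai)*(br + i*bi) = (ar*br - ai*bi) + i*(ar*bi + ai*br)
      sr = 0.0
      si = 0.0
      do k = 1, n
         sr = sr + ar(k)*br(k) - ai(k)*bi(k)
         si = si + ar(k)*bi(k) + ai(k)*br(k)
      enddo

      print *, 'complex loop:', ref
      print *, 'split loop:  ', cmplx(sr, si)
      if (abs(ref - cmplx(sr, si)) .gt. 1.0e-3) stop 1
      end
```

Compiling this with -Minfo would show whether the split loop is reported as vectorized on a given compiler version.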

You may also get better performance from using the SUM intrinsic rather than loops. I haven't timed it for this example, but it's usually better optimized:

sum1 = sum1 + sum(sum(sum(a*b,1),1),1)

Also, you must rename your `sum' variable if you want to use the SUM function. Array operations are nice. Use them when possible. (They also tend to have a better chance of auto-vectorizing than explicitly written loops.)
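A small self-checking sketch of the SUM form, with the accumulator renamed so it no longer hides the intrinsic. The array sizes are reduced to 16 and the test values are arbitrary, both assumptions made only to keep the example light. It also shows that SUM with no DIM argument reduces the whole array in one call, which gives the same result as the three nested reductions. No timing claim is made for either form.

```fortran
      program sum_intrinsic
      implicit none

      integer nx, ny, nz
      parameter (nx=16, ny=16, nz=16)

      complex a(nz,ny,nx), b(nz,ny,nx)
      complex sum1, sum2, sum3
      integer i, j, k

      ! Arbitrary test values; each product is (5.0, 5.0)
      a = cmplx(1.0, 2.0)
      b = cmplx(3.0, -1.0)

      ! Explicit loops, with the accumulator renamed to sum1
      sum1 = (0.0, 0.0)
      do i = 1, nx
         do j = 1, ny
            do k = 1, nz
               sum1 = sum1 + a(k,j,i)*b(k,j,i)
            enddo
         enddo
      enddo

      ! Nested form: reduce along the first dimension three times
      sum2 = sum(sum(sum(a*b,1),1),1)

      ! Whole-array reduction with no DIM argument
      sum3 = sum(a*b)

      print *, sum1, sum2, sum3
      if (abs(sum1 - sum2) .gt. 1.0e-2) stop 1
      if (abs(sum1 - sum3) .gt. 1.0e-2) stop 1
      end
```

Note that the one-line form above adds into sum1, so sum1 has to be set to zero before that statement.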

Good luck,

-Matt
All times are GMT - 7 Hours
Page 2 of 2

 