PGI User Forum

How can I make -Mvect=sse and -mp work together?

 
tangoman



Joined: 23 Aug 2007
Posts: 6

Posted: Thu Aug 23, 2007 2:33 am    Post subject: How can I make -Mvect=sse and -mp work together?

Hi all,

I am trying to tune a numerical computing program with OpenMP on a multi-core AMD machine. I found that the program built with the -mp option is much slower than the one built without -mp when both run with one thread. A simple test case follows:

Code:

!$OMP PARALLEL
!$OMP DO PRIVATE(i,j,k)
      do i=1,nx
         do j=1,ny
            do k=1,nz
               tmp = c0*(a(k-4,j,i)+a(k+4,j,i))
     &             + c1*(a(k-3,j,i)+a(k+3,j,i))
     &             + c2*(a(k-2,j,i)+a(k+2,j,i))
     &             + c3*(a(k-1,j,i)+a(k+1,j,i))
     &             + c4*a(k,j,i)
               b(k,j,i) = b(k,j,i)+c5*tmp
            enddo
         enddo
      enddo
!$OMP END PARALLEL

I used the -Minfo option to display compile-time optimization messages. It seems that -Mvect=sse conflicts with -mp. The difference is shown below:

pgf90 -tp k8-64 -fastsse -Minfo -Mneginfo  -c -o test.o test.f
my_test:
    19, Generated 3 alternate loops for the inner loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop

pgf90 -tp k8-64 -fastsse -mp -Minfo -Mneginfo  -c -o test.o test.f
my_test:
    15, Parallel region activated
    17, Parallel loop activated; static block iteration allocation
    19, Unrolled inner loop 8 times
        Generated 2 prefetch instructions for this loop
    29, Barrier
        Parallel region terminated

How can I make them work together? Any suggestions are welcome.

Thanks!

 
brentl



Joined: 20 Jul 2004
Posts: 132

Posted: Fri Aug 24, 2007 5:44 pm

You might need to declare tmp to be private.
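
A minimal sketch of that change, written as a self-contained free-form Fortran program rather than the fixed-form test.f from the first post. The array sizes, the halo bounds on a, and the coefficient values are assumptions chosen only so the example runs; the one substantive difference from the posted loop is that tmp appears in the PRIVATE clause. Because tmp is assigned in every iteration, leaving it shared means all threads write the same variable.

```fortran
! Free-form sketch. Sizes, bounds, and coefficient values are assumptions.
program stencil_private
  implicit none
  integer, parameter :: nx = 32, ny = 32, nz = 64   ! hypothetical sizes
  ! a carries a 4-point halo in k so that a(k-4) and a(k+4) stay in bounds.
  real :: a(-3:nz+4, ny, nx), b(nz, ny, nx)
  real :: c0, c1, c2, c3, c4, c5, tmp
  integer :: i, j, k

  c0 = 0.1; c1 = 0.2; c2 = 0.3; c3 = 0.4; c4 = 0.5; c5 = 2.0
  a = 1.0
  b = 0.0

  ! tmp is written by every iteration, so each thread gets its own copy.
  !$omp parallel do private(i, j, k, tmp)
  do i = 1, nx
     do j = 1, ny
        do k = 1, nz
           tmp = c0*(a(k-4,j,i) + a(k+4,j,i)) &
               + c1*(a(k-3,j,i) + a(k+3,j,i)) &
               + c2*(a(k-2,j,i) + a(k+2,j,i)) &
               + c3*(a(k-1,j,i) + a(k+1,j,i)) &
               + c4*a(k,j,i)
           b(k,j,i) = b(k,j,i) + c5*tmp
        end do
     end do
  end do
  !$omp end parallel do

  ! With a = 1 everywhere, each element of b should equal
  ! c5*(2*(c0+c1+c2+c3) + c4), so the deviation should be near zero.
  print *, 'max deviation:', maxval(abs(b - c5*(2.0*(c0+c1+c2+c3) + c4)))
end program stencil_private
```

Compiling this with the same flags as in the first post (for example with -fastsse -mp -Minfo) should show whether the inner loop is reported as vectorized; the exact messages depend on the compiler version.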
tangoman



Joined: 23 Aug 2007
Posts: 6

Posted: Sun Aug 26, 2007 10:56 pm

Yes, I made a mistake here. Thanks, brentl.

After I declared tmp as private, the optimization output is still a little different from the output without the -mp flag:

    15, Parallel region activated
    17, Parallel loop activated; static block iteration allocation
    19, Generated an alternate loop for the inner loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop
        Generated vector sse code for inner loop
        Generated 2 prefetch instructions for this loop
    29, Barrier
        Parallel region terminated

Any suggestions?
brentl



Joined: 20 Jul 2004
Posts: 132

Posted: Thu Sep 13, 2007 11:02 am

Our altcode generator makes decisions based on a number of factors, and being inside a parallel region is one of them. That is why the output differs. If you find that it makes a big performance difference, please let us know. Since the code now vectorizes in both cases, it should be running fairly well.
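
One way to check whether the difference matters is to time the same loop in both builds. The following is only a sketch: the sizes, repetition count, and coefficient values are assumptions, and it uses the standard system_clock intrinsic so that the same source compiles with or without -mp.

```fortran
! Timing sketch in free-form Fortran. Sizes, repetition count, and
! coefficient values are assumptions, not taken from the original program.
program time_stencil
  implicit none
  integer, parameter :: nx = 64, ny = 64, nz = 256, nrep = 20
  real, allocatable :: a(:,:,:), b(:,:,:)
  real :: c0, c1, c2, c3, c4, c5, tmp
  integer :: i, j, k, rep
  integer :: t_start, t_end, ticks_per_sec

  ! a carries a 4-point halo in k so the stencil stays in bounds.
  allocate(a(-3:nz+4, ny, nx), b(nz, ny, nx))
  c0 = 0.1; c1 = 0.2; c2 = 0.3; c3 = 0.4; c4 = 0.5; c5 = 0.01
  a = 1.0
  b = 0.0

  call system_clock(t_start, ticks_per_sec)
  do rep = 1, nrep
     ! Without -mp this directive is an ordinary comment and the loop runs serially.
     !$omp parallel do private(i, j, k, tmp)
     do i = 1, nx
        do j = 1, ny
           do k = 1, nz
              tmp = c0*(a(k-4,j,i) + a(k+4,j,i)) &
                  + c1*(a(k-3,j,i) + a(k+3,j,i)) &
                  + c2*(a(k-2,j,i) + a(k+2,j,i)) &
                  + c3*(a(k-1,j,i) + a(k+1,j,i)) &
                  + c4*a(k,j,i)
              b(k,j,i) = b(k,j,i) + c5*tmp
           end do
        end do
     end do
     !$omp end parallel do
  end do
  call system_clock(t_end)

  print *, 'elapsed seconds:', real(t_end - t_start) / real(ticks_per_sec)
  ! Printing a checksum keeps the compiler from discarding the loop as dead code.
  print *, 'checksum:', sum(b)
end program time_stencil
```

Building it once with -mp and once without, then running the -mp build with the OMP_NUM_THREADS environment variable set to 1, gives a like-for-like single-thread comparison.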
tangoman



Joined: 23 Aug 2007
Posts: 6

Posted: Wed Sep 19, 2007 12:32 am

Thanks, these two versions run at almost the same speed.
All times are GMT - 7 Hours