PGI User Forum

Slowness by pointer transfer

 
Author Message
svend



Joined: 12 Jan 2006
Posts: 3

Posted: Thu Jan 12, 2006 8:36 am    Post subject: Slowness by pointer transfer

Hi,

I use pgf90 6.0-2 32-bit target on x86 Linux and I have a program with calls of the type

Code:
call foo(event%time)


Here, the subroutine foo is defined in a module, and event is an allocatable array of a derived type:

Code:

  type :: event_properties
    real :: time
!...
  end type event_properties
  type (event_properties), allocatable, dimension(:) :: event


I have compiled the code both using pgf90 and ifort, using both optimization and debug flags. The pgf90-generated executable runs about four times slower than the one made by ifort!
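
For readers without access to the original test program, the call pattern can be sketched roughly as follows. This is a hypothetical reproducer, not the poster's code: the extra components, the body of foo, its argument list, and the array size are all assumptions made for illustration.

```fortran
! Hypothetical reproducer; names and sizes are illustrative only.
module event_mod
  implicit none
  type :: event_properties
    real :: time
    integer :: id                ! extra components widen the stride between
    character(len=16) :: label   ! consecutive "time" values in memory
  end type event_properties
contains
  subroutine foo(time, n, total)
    integer, intent(in) :: n
    real, intent(in) :: time(n)  ! explicit-shape dummy: expects contiguous data
    real, intent(out) :: total
    total = sum(time)            ! stand-in workload
  end subroutine foo
end module event_mod

program strided_call
  use event_mod
  implicit none
  integer, parameter :: n = 1000
  type(event_properties), allocatable :: event(:)
  real :: total
  integer :: i

  allocate(event(n))
  do i = 1, n
    event(i)%time = 1.0
    event(i)%id = i
    event(i)%label = 'x'
  end do

  ! event%time is a non-contiguous array section (one real per element of
  ! event), so a compiler typically gathers it into a temporary first.
  call foo(event%time, n, total)
  print *, total
end program strided_call
```

The point of the sketch is only that event%time picks one component out of every array element, so the values are not adjacent in memory.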

Has anyone experienced this, and do you know what I can do to speed up pgf90? I have looked in the manual pages, but did not find any flags that helped.

Regards,
Svend
mkcolg



Joined: 30 Jun 2004
Posts: 6213
Location: The Portland Group Inc.

Posted: Thu Jan 12, 2006 10:35 am

Hi Svend,

I'm unaware of any performance problems related to pointer transfer, so I don't have any good suggestions for you. Would it be possible to get a copy of this code so we can investigate what's going wrong? If you could post a link here or send the information to trs@pgroup.com, I would appreciate it. Also, please post more detail about what optimizations you have tried and what OS you're using.

Thanks,
Mat
svend



Joined: 12 Jan 2006
Posts: 3

Posted: Fri Jan 13, 2006 1:05 am

Hi Mat,

Thanks for your reply. My OS is SuSE Linux 9.3 with kernel 2.6.11.4-21.10-default (but I have also tried other Linuces).

Here's the code: http://www.pvv.org/~stm/pointer-trouble.f90

It's just a 50-line test program. It doesn't do anything sensible, but it's stripped down from a "real-life" example. There's a type declaration with reals, integers, and characters, which are not all used in the test program. I have noticed that the program runs faster if I strip them away, but I need them in the real-life code...

I compiled the code using pgf90 -fast -r8, and ifort -O3 -r8. On my computer, the pgf90 code took 5.7 seconds, while the ifort code took 1.4 seconds. The CPU times are also similar without optimization flags.

Thanks,
Svend
mkcolg



Joined: 30 Jun 2004
Posts: 6213
Location: The Portland Group Inc.

Posted: Fri Jan 13, 2006 4:17 pm

Hi Svend,

I've passed your code on to our compiler team since it looks like a bug to me. What's happening is that before the call to "foo", the compiler needs to create a temporary array containing the values of "event%t_last_collision". This is correct and expected. After the call, we gather up the results and put them back into event. However, since the "time" variable in foo is declared "intent(in)", foo cannot modify it, so we don't need to do the extra step of copying the results back.
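
To make the copy-in and copy-out steps concrete, here is a conceptual sketch with the temporary written out by hand. It is not actual compiler output; the type layout, the body of foo, and the names tmp and nhit are assumptions for illustration.

```fortran
! Conceptual sketch, not compiler output. All names are illustrative.
module collision_mod
  implicit none
  type :: event_properties
    real :: t_last_collision
    integer :: id
  end type event_properties
contains
  subroutine foo(time, n, nhit)
    integer, intent(in) :: n
    real, intent(in) :: time(n)    ! read-only inside foo
    integer, intent(out) :: nhit
    nhit = count(time > 0.0)       ! stand-in workload
  end subroutine foo
end module collision_mod

program copy_in_copy_out
  use collision_mod
  implicit none
  integer, parameter :: maxpart = 8
  type(event_properties) :: event(maxpart)
  real :: tmp(maxpart)
  integer :: i, nhit

  do i = 1, maxpart
    event(i)%t_last_collision = real(i)
    event(i)%id = i
  end do

  ! Roughly what "call foo(event%t_last_collision, maxpart, nhit)" amounts to:
  tmp = event%t_last_collision     ! copy-in (gather): always needed
  call foo(tmp, maxpart, nhit)
  event%t_last_collision = tmp     ! copy-out (scatter): redundant for intent(in)

  print *, nhit
end program copy_in_copy_out
```

Because the dummy argument is intent(in), the last assignment writes back values that cannot have changed, which is the step described above as unnecessary.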

Note that you could dramatically speed up your code (with both PGI and Intel) by using a temporary array to store the values of t_last_collision, since otherwise either compiler must create its own temporary array before entry into foo on each iteration of the loop.

Example:
Code:
  allocate(tmp2(maxpart))
  tmp2 = event%t_last_collision
  do i=1,10000
!    call foo(event%t_last_collision,maxpart,count)
    call foo(tmp2,maxpart,count)
  end do
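
For anyone who wants to measure the effect, the fragment above might be expanded into a complete program along these lines. This is a sketch under assumptions: the type layout, the body of foo, the sizes, and the variable names other than tmp2 and maxpart are invented here, and an optimizing compiler may inline foo or drop the temporary, so the two timings need not differ on every system.

```fortran
! Illustrative benchmark; sizes, names and the body of foo are assumptions.
module bench_mod
  implicit none
  type :: event_properties
    real :: t_last_collision
    integer :: id
    character(len=16) :: label
  end type event_properties
contains
  subroutine foo(time, n, nhit)
    integer, intent(in) :: n
    real, intent(in) :: time(n)
    integer, intent(out) :: nhit
    nhit = count(time > 0.0)        ! stand-in workload
  end subroutine foo
end module bench_mod

program hoist_temporary
  use bench_mod
  implicit none
  integer, parameter :: maxpart = 100000, nrep = 1000
  type(event_properties), allocatable :: event(:)
  real, allocatable :: tmp2(:)
  real :: t0, t1, t2
  integer :: i, nhit_strided, nhit_hoisted

  allocate(event(maxpart), tmp2(maxpart))
  do i = 1, maxpart
    event(i)%t_last_collision = real(i)
    event(i)%id = i
    event(i)%label = ' '
  end do

  call cpu_time(t0)
  do i = 1, nrep
    ! Strided actual argument: a temporary may be built on every call.
    call foo(event%t_last_collision, maxpart, nhit_strided)
  end do
  call cpu_time(t1)

  tmp2 = event%t_last_collision     ! one gather, hoisted out of the loop
  do i = 1, nrep
    call foo(tmp2, maxpart, nhit_hoisted)
  end do
  call cpu_time(t2)

  if (nhit_strided /= nhit_hoisted) error stop 'results differ'
  print '(a,f8.3,a)', 'strided argument: ', t1 - t0, ' s'
  print '(a,f8.3,a)', 'hoisted array:    ', t2 - t1, ' s'
end program hoist_temporary
```

Hoisting the copy is only valid when the values in event do not change inside the loop and foo does not write to its argument; otherwise tmp2 has to be refreshed or copied back.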


Once I know more, I'll post it here. We appreciate you calling this to our attention.

Thanks,
Mat
svend



Joined: 12 Jan 2006
Posts: 3

Posted: Fri Jan 27, 2006 1:32 am

Hi Mat,

Thank you for your answer (and sorry for my late one). What we found out was similar to your advice, and I ended up using a normal array instead of "event%t_last_collision", gaining a lot of speed (similar to switching from -g to -fast).
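
One way such a change could look is to keep each property in its own contiguous array instead of inside an array of derived types. The sketch below is an assumption about the layout, not the actual code from this thread: the event_table type, the module name, and the body of foo are hypothetical, and allocatable components require a compiler with Fortran 2003 (or TR 15581) support.

```fortran
! Hypothetical layout: one contiguous array per property.
module event_store
  implicit none
  type :: event_table
    real, allocatable :: t_last_collision(:)
    integer, allocatable :: id(:)
  end type event_table
contains
  subroutine foo(time, n, nhit)
    integer, intent(in) :: n
    real, intent(in) :: time(n)
    integer, intent(out) :: nhit
    nhit = count(time > 0.0)        ! stand-in workload
  end subroutine foo
end module event_store

program soa_layout
  use event_store
  implicit none
  integer, parameter :: maxpart = 1000
  type(event_table) :: events
  integer :: nhit

  allocate(events%t_last_collision(maxpart), events%id(maxpart))
  events%t_last_collision = 1.0
  events%id = 0

  ! The actual argument is a whole contiguous array, so no gather or
  ! scatter temporary should be needed for the call.
  call foo(events%t_last_collision, maxpart, nhit)
  print *, nhit
end program soa_layout
```

The trade-off is that the properties of a single event are no longer stored together, which may or may not suit the rest of the code.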

I'd be interested in knowing any further developments, though.

Regards,
Svend
All times are GMT - 7 Hours