PGI User Forum
PGI FORTRAN OpenMP: poor performance in a big loop???
Nick Kong



Joined: 08 Jun 2012
Posts: 11

Posted: Fri Jun 08, 2012 8:36 am    Post subject: PGI FORTRAN OpenMP: poor performance in a big loop???

Hello, All,

I am a PGI FORTRAN (2011) OpenMP user, and I am trying to parallelize part of my FORTRAN program with OpenMP. If I run the parallel section in isolation, it is faster than the sequential version. But if the parallel section sits inside a big loop (so the threads are created and closed on every iteration), it is much slower than the sequential version. Has anyone had a similar experience? The big outer loop is needed to complete the computation, and it is almost impossible to parallelize the outer loop itself. Is there any solution for this?

Thank you!
Nick
mkcolg



Joined: 30 Jun 2004
Posts: 6206
Location: The Portland Group Inc.

Posted: Fri Jun 08, 2012 9:06 am

Hi Nick,

There could be any number of reasons why this is happening: synchronization issues, memory, a coding error, etc. Without an example, we don't really have any way of knowing.

What I'd suggest doing is profiling your code using 'pgcollect' and then reviewing the resulting profile in PGPROF. (details on how to profile can be found in the PGPROF User's Guide). This will give a better idea of where the time is being spent. In particular, look for the "mp_barrier" routine. My best guess at this point is that your threads get stuck waiting for each other.

Also, if you could post a basic example of what you are doing, including the OpenMP directives, that may help.

Best Regards,
Mat
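
One quick check that needs no profiler is to time the two loop structures directly with omp_get_wtime. The program below is only a minimal sketch under stated assumptions: it is not Nick's actual code, and the names region_overhead, work, nouter and ninner are hypothetical. It compares entering a parallel region on every outer iteration against entering a single region that encloses the sequential outer loop. Many OpenMP runtimes keep their threads alive between regions, so the measured gap may be small or large depending on the runtime and on how much work each iteration does.

```fortran
program region_overhead
  use omp_lib
  implicit none
  ! Hypothetical sizes chosen to mirror the 1000 x 10 shape in the thread.
  integer, parameter :: nouter = 1000, ninner = 10
  integer :: i, j
  double precision :: t0, t1, t2
  double precision :: work(ninner)

  work = 0.0d0

  ! Variant 1: a parallel region is entered and left on every outer iteration.
  t0 = omp_get_wtime()
  do j = 1, nouter
     !$omp parallel do private(i)
     do i = 1, ninner
        work(i) = work(i) + 1.0d0
     end do
     !$omp end parallel do
  end do
  t1 = omp_get_wtime()

  ! Variant 2: one parallel region encloses the sequential outer loop.
  ! Every thread runs the j loop; only the inner loop is work-shared.
  !$omp parallel private(i, j)
  do j = 1, nouter
     !$omp do
     do i = 1, ninner
        work(i) = work(i) + 1.0d0
     end do
     !$omp end do   ! implicit barrier keeps the outer iterations in order
  end do
  !$omp end parallel
  t2 = omp_get_wtime()

  print '(a,f10.6,a)', 'region per iteration: ', t1 - t0, ' s'
  print '(a,f10.6,a)', 'single region:        ', t2 - t1, ' s'
  ! Sanity check: each element was incremented nouter times in each variant.
  print '(a,f8.1)', 'work(1) = ', work(1)
end program region_overhead
```

Build with OpenMP enabled (for example -mp with the PGI compilers, or -fopenmp with gfortran). If both variants are slow compared with a serial run, the inner loop is probably too small to amortize the barrier at the end of each work-shared loop, which would fit the mp_barrier guess above.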
Nick Kong



Joined: 08 Jun 2012
Posts: 11

Posted: Fri Jun 08, 2012 1:40 pm

Hi, Mat,

Thank you for your help!

Here is the example code:

CALL omp_set_num_threads(10)

DO J = 1, 1000
!$OMP PARALLEL
!$OMP DO PRIVATE(I)
   DO I = 1,10
!$OMP TASK
      CALL MY_SUB()
!$OMP END TASK
   END DO
!$OMP END DO
!$OMP END PARALLEL
END DO

SUBROUTINE MY_SUB()
   CALL SLEEP(0.01)
END SUBROUTINE MY_SUB

Inside the outer "DO" loop, the parallel version becomes slower than the sequential version.

Thank you again.
Nick
mkcolg



Joined: 30 Jun 2004
Posts: 6206
Location: The Portland Group Inc.

Posted: Fri Jun 08, 2012 2:11 pm

Hi Nick,

Your program is hanging in the SLEEP routine since it expects an integer but you're passing in a float. To fix, change "0.01" to "1".

Hope this helps,
Mat
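
For reference, Mat's fix applied to the posted sample might look like the sketch below. This is a hedged illustration, not a confirmed version of Nick's program: the program name sleep_example is hypothetical, the outer loop is shortened from 1000 to 3 iterations so the demo finishes quickly, and the TASK construct is dropped as a simplification because the work-shared loop already distributes the ten calls across the threads. SLEEP is a non-standard library routine, assumed here to take a default INTEGER number of seconds as described above.

```fortran
program sleep_example
  use omp_lib
  implicit none
  integer :: i, j

  call omp_set_num_threads(10)

  ! Outer loop shortened from 1000 (assumption, for a quick demo only).
  do j = 1, 3
     !$omp parallel do private(i)
     do i = 1, 10
        call my_sub()
     end do
     !$omp end parallel do
  end do

contains

  subroutine my_sub()
     ! SLEEP is called through an implicit interface, so a REAL literal
     ! such as 0.01 is not converted to INTEGER; pass an integer instead.
     call sleep(1)
  end subroutine my_sub

end program sleep_example
```

With ten threads, the ten one-second sleeps in each outer iteration can overlap, whereas a serial run has to perform them one after another.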
Nick Kong



Joined: 08 Jun 2012
Posts: 11

Posted: Sat Jun 09, 2012 8:40 am

Hi, Mat,

The current sample may not reproduce the problem well (I put it together in a rush over the weekend). I will post a better sample when I am back in my office on June 11. Thank you!!!

Nick
All times are GMT - 7 Hours
Page 1 of 4

 