PGI User Forum

application not accelerating with openACC
efblack2



Joined: 28 Sep 2015
Posts: 11

Posted: Wed Jun 01, 2016 9:07 am    Post subject: application not accelerating with openACC

Hi all,

I have a 3D fluid dynamics application that I have been trying to accelerate.
I am using:
pgcc 16.4-0 64-bit target on x86-64 Linux -tp haswell

I wanted to create a serial version first.
Initially I compiled the application with the -fast -Mnovect options to have a baseline to compare against. Then I used -fast -Mvect=simd and got a speedup of 1.56.

My next step was to use OpenMP. I got a speedup of about 4 with 4 threads relative to the serial version without vectorization.

Now I am trying to see how much speedup I can get using my GPUs, which are a pair of GeForce GT 750M cards with 2048 MB of total memory each.

However, the best I can get is about 1.5 times slower than the serial version without vectorization, which is very disappointing. Profiling the application shows that data traffic is kept to a minimum. I am beginning to think that my GPUs are not powerful enough for this application, despite the fact that they have performed quite acceptably for other applications. Unfortunately I do not have access to other GPUs + PGI compilers to try.

I wonder if someone would like to give it a try. I have the code files, a makefile and a data file.
mkcolg



Joined: 30 Jun 2004
Posts: 6693
Location: The Portland Group Inc.

Posted: Wed Jun 01, 2016 9:18 am

Hi efblack2,

Sure, I can take a look. Please either post a link to your source or send the package to PGI Customer Service (trs@pgroup.com) and ask them to forward it to me. Please include all versions (OpenACC, OpenMP, etc.) and workloads.

- Mat
efblack2



Joined: 28 Sep 2015
Posts: 11

Posted: Wed Jun 01, 2016 9:50 am

Hi Mat,

Thanks for the quick response.

I just sent a file (numFlDyn3d.tar.gz) to trs@pgroup.com.

Inside, there is a data file called lowRes.txt. To run the program just provide the name of this data file as a command parameter.

Also inside the tar file you will find 3 makefiles (Makefile1, Makefile2, Makefile3) and a soft link to a Makefile.

Makefile1 will create the serial version w/o any vectorization.
Makefile2 will create the serial version with vectorization.
Makefile3 will create the OpenACC version. This is the version I am having problems with. I am providing the makefiles for the serial versions only for reference.

I hope you have all the required information to run the programs.


Thanks again,

Edgar Black
mkcolg



Joined: 30 Jun 2004
Posts: 6693
Location: The Portland Group Inc.

Posted: Fri Jun 03, 2016 2:43 pm

Hi Edgar,

I looked over the code. The biggest issue I see is that you're putting "malloc" in device code. A device-side malloc is very slow. To fix this, I malloc'd these temp arrays on the host and then put them into a "private" clause. This helped overall performance by about 25%.
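
For readers following along, here is a minimal sketch of the kind of change described above: the scratch array is allocated once on the host and handed to the loop through a "private" clause, rather than calling malloc inside device code. The function name smooth_rows, the three-point average, and the row-major layout are illustrative assumptions and are not taken from the actual application.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: replace every element of each row with the
   average of itself and its two neighbours (edges reuse themselves).
   Returns 0 on success, -1 if the scratch row cannot be allocated. */
int smooth_rows(double *field, int ny, int nx)
{
    assert(field != NULL && ny > 0 && nx > 0);

    /* Allocate the scratch row once, on the host, instead of calling
       malloc inside the accelerated loop. */
    double *tmp = malloc((size_t)nx * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* private(tmp[0:nx]) asks for a separate copy of the scratch row
       per gang, so rows do not overwrite each other's temporaries. */
    #pragma acc parallel loop gang private(tmp[0:nx]) copy(field[0:ny*nx])
    for (int j = 0; j < ny; ++j) {
        double *row = field + (size_t)j * nx;

        /* Save the original row values before overwriting them. */
        #pragma acc loop vector
        for (int i = 0; i < nx; ++i)
            tmp[i] = row[i];

        #pragma acc loop vector
        for (int i = 0; i < nx; ++i) {
            double left  = tmp[i > 0 ? i - 1 : i];
            double right = tmp[i < nx - 1 ? i + 1 : i];
            row[i] = (left + tmp[i] + right) / 3.0;
        }
    }

    free(tmp);
    return 0;
}
```

When built without OpenACC support, the pragmas are ignored and the function runs serially, which makes it easy to check the results on the host before timing the accelerated build.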

Also, the workload seems fairly small. My guess is that as you increase the size of the workload, you'll also see much better relative performance.

- Mat
efblack2



Joined: 28 Sep 2015
Posts: 11

Posted: Thu Jun 09, 2016 9:03 am

Hi Mat,

Thanks again for the advice about malloc.

I have not gotten any improvement in the last week.

I wonder if memory alignment could be an issue. All my arrays are allocated with malloc on the host and passed with pcopy when entering the data region.

Could memory alignment be a factor here? If yes, how can I enforce alignment in the kernels?
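
To make the question concrete, here is a small sketch of how a host array could be allocated on a chosen boundary and then passed through a data region with pcopy. The helper names (alloc_aligned_doubles, is_aligned, scale_twice) and the 64-byte boundary are assumptions for illustration only. The device copy is allocated by the OpenACC runtime, so host alignment alone may not determine how the kernels see the data, and nothing in this thread establishes that alignment is the bottleneck.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate n doubles on an `alignment`-byte boundary (power of two).
   The caller releases the memory with free(). */
double *alloc_aligned_doubles(size_t n, size_t alignment)
{
    assert(n > 0 && alignment > 0 && (alignment & (alignment - 1)) == 0);

    size_t bytes = n * sizeof(double);
    /* C11 aligned_alloc expects the size to be a multiple of the
       alignment, so round the request up. */
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, padded);
}

/* Return 1 if p sits on an `alignment`-byte boundary, 0 otherwise. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical kernel pair: multiply every element by s, twice.
   The data region keeps the array on the device across both loops,
   so it is transferred at most once in each direction. */
void scale_twice(double *a, int n, double s)
{
    #pragma acc data pcopy(a[0:n])
    {
        #pragma acc parallel loop present(a[0:n])
        for (int i = 0; i < n; ++i)
            a[i] *= s;

        #pragma acc parallel loop present(a[0:n])
        for (int i = 0; i < n; ++i)
            a[i] *= s;
    }
}
```

Swapping malloc for a helper like this in the host allocation code and comparing timings would be one way to test whether host alignment makes any measurable difference.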

Thanks,

Edgar Black
All times are GMT - 7 Hours
Page 1 of 2