PGI User Forum

Problem:Fortran code with open ACC doesn't gain any speed up
rzou1



Joined: 03 Feb 2014
Posts: 5

Posted: Thu Feb 06, 2014 9:52 am    Post subject: Problem:Fortran code with open ACC doesn't gain any speed up

I am new to OpenACC and have run into a puzzling problem. I was learning OpenACC using a code example from the PGI Accelerator™ Compilers OpenACC Getting Started Guide at: http://www.pgroup.com/doc/openACC_gs.pdf.
The code is on pages 13 and 14 of the document.

I used PGI Visual Fortran with the Visual Studio shell, and compiled the code both with "Enable OpenACC Directives" turned on and with no OpenACC directives. However, after running both versions, to my surprise the run time of the two versions is almost identical (I changed n to a much larger value, 500000000, to provide a sufficient workload). In other words, OpenACC didn't make the program run faster in this case.

The accelerator of my computer is: NVIDIA Quadro K2000M. It is a Win 7 Quad Core laptop.

Any advice and suggestions will be highly appreciated.
mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Thu Feb 06, 2014 2:39 pm

Hi rzou1,

Assuming that you're running the vecadd example, with n=500000000 the total memory usage will be ~5.5GB. Since your card only has 2GB, I would expect the run to fail with an out-of-memory error if it ran on the device. Does it seem to work if you change n back to its original value?
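
As a rough check of that estimate, here is a minimal sketch of the arithmetic. It assumes the device holds three default (4-byte) real arrays of length n (a, b and r); the program and variable names are only illustrative and are not taken from the guide.

```fortran
program vecadd_footprint
  implicit none
  ! Kinds chosen so that the byte count does not overflow a default integer.
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = kind(1.0d0)

  ! Assumption: three single-precision arrays (a, b, r) of length n on the device.
  integer(i8), parameter :: n = 500000000_i8
  integer(i8), parameter :: bytes_per_real = 4_i8
  integer(i8), parameter :: num_arrays = 3_i8

  integer(i8) :: total_bytes
  real(dp) :: total_gib

  total_bytes = num_arrays * n * bytes_per_real
  total_gib = real(total_bytes, dp) / 1024.0_dp**3

  print '(a,i0)', 'total bytes: ', total_bytes
  print '(a,f0.2)', 'total GiB: ', total_gib
end program vecadd_footprint
```

Under these assumptions the total is 6,000,000,000 bytes, or about 5.59 GiB, which is well beyond a 2GB card.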

If not, then you may not have the appropriate flags set to generate the OpenACC code, or your device may not be configured, in which case the code runs on the host (by default, both a host and a GPU version of the code are created).
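
One way to check from inside a program whether the OpenACC runtime can see a device is to query it. The following is a minimal sketch, assuming the program is built with OpenACC enabled (for example with -acc) so that the openacc module is available; the program name is hypothetical.

```fortran
program check_device
  use openacc
  implicit none
  integer :: ndev

  ! Ask the OpenACC runtime how many NVIDIA devices it can see.
  ndev = acc_get_num_devices(acc_device_nvidia)
  print *, 'NVIDIA devices visible to the OpenACC runtime:', ndev

  if (ndev == 0) then
    ! With a combined host/GPU build, this is the case where the
    ! accelerator regions would be expected to run on the host instead.
    print *, 'No NVIDIA device found.'
  end if
end program check_device
```

A count of zero here would be consistent with the accelerator regions running on the host.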

What does your build log look like? What is the output of the "pgaccelinfo" utility (run from a PGI DOS command shell)?

- Mat
rzou1



Joined: 03 Feb 2014
Posts: 5

Posted: Thu Feb 06, 2014 3:19 pm    Post subject: re:Problem:Fortran code with open ACC doesn't gain any speed

Hi Mat, thanks for your quick response. Here is how I set up the build in the Visual Studio IDE:
1) Properties-->Fortran-->Preprocess-->Preprocess source file;
2) Properties-->Fortran-->Language-->Enable OpenACC Directives
3) Properties-->Fortran-->Target Accelerators-->Target NVIDIA TESLA
4) Properties-->Fortran-->Target Host

I reduced n to 50000000, and the code takes 2.4 seconds to run.

Then I removed "Target Accelerators", i.e., kept only 1), 2), and 4) above, and recompiled; the code again takes about 2.4 seconds to run.

I then turned off the OpenACC directives, i.e., kept only 1) above, and recompiled; the code still takes about 2.4 seconds to run.

BTW, when I activate all of 1) to 4), the compiler output is:
vecaddgpu:
     17, Generating copyin(a(:n))
         Generating copyin(b(:n))
         Generating copyout(r(:n))
         Generating NVIDIA code
     18, Loop is parallelizable
         Accelerator kernel generated
         18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
Linking...
vector_add build succeeded.

Thanks a lot!
rzou1



Joined: 03 Feb 2014
Posts: 5

Posted: Thu Feb 06, 2014 3:30 pm    Post subject: re:Problem:Fortran code with open ACC doesn't gain any speed

Hi Mat, here is the Build.log:


Compiling Project ...

..\..\vector_addition_from_the_startup_guide.f90

c:\program files\pgi\win64\14.1\bin\pgfortran.exe -Hx,123,8 -Hx,123,0x40000 -Hx,0,0x40000000 -Mx,0,0x40000000 -Hx,0,0x20000000 -Mpreprocess -g -Bstatic -Mbackslash -acc -Mfree -I"c:\program files\pgi\win64\14.1\include" -I"C:\Program Files\PGI\Microsoft Open Tools 12\include" -I"C:\Program Files (x86)\Windows Kits\8.1\Include\shared" -I"C:\Program Files (x86)\Windows Kits\8.1\Include\um" -ta=tesla,host -Minform=warn -module "x64\Debug" -Minfo=accel -o "x64\Debug\vector_addition_from_the_startup_guide.obj" -c "C:\Allstuff\EFDC\CUDA_open_ACC_related\Open_ACC_examples\Vector_addition\vector_addition_from_the_startup_guide.f90"

Command exit code: 0

Command output: [
NOTE: your trial license will expire in 12 days, 6.56 hours.
NOTE: your trial license will expire in 12 days, 6.56 hours.
vecaddgpu:
     17, Generating copyin(a(:n))
         Generating copyin(b(:n))
         Generating copyout(r(:n))
         Generating NVIDIA code
     18, Loop is parallelizable
         Accelerator kernel generated
         18, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
]

Linking...

c:\program files\pgi\win64\14.1\bin\pgfortran.exe -Wl,/libpath:"c:\program files\pgi\win64\14.1\lib" -Wl,/libpath:"C:\Program Files\PGI\Microsoft Open Tools 12\lib\amd64" -Wl,/libpath:"C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x64" -Yl,"C:\Program Files\PGI\Microsoft Open Tools 12\bin\amd64" -g -Bstatic -acc -ta=tesla,host -o "C:\Allstuff\EFDC\CUDA_open_ACC_related\Open_ACC_examples\Vector_addition\vector_add\vector_add\x64\Debug\vector_add.exe" "x64\Debug\vector_addition_from_the_startup_guide.obj"

Command exit code: 0

vector_add build succeeded.
mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Thu Feb 06, 2014 3:57 pm

It looks like it's building fine. Try adding "-ta=tesla:time" to the "Command Line" options in the property pages, or set the environment variable "PGI_ACC_TIME=1" in the DOS cmd window and run your program from there. The program should then print out profiling information if it ran on the GPU.

One thing to keep in mind is that vecadd is a very simple example with very little computation. Hence, you may not see much speed-up. Instead, you might want to try the Matmul example in: C:\Program Files (x86)\Microsoft Visual Studio 11.0\PGI Visual Fortran\Samples\gpu\AccelPM_Matmul
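
To put "very little computation" in rough numbers, the sketch below compares floating-point operations per byte of array data for a vector add and a dense matrix multiply. It assumes 4-byte reals and three arrays in each case; the matrix size of 1000 is a hypothetical value chosen only for illustration.

```fortran
program intensity_estimate
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: bytes_per_real = 4.0_dp
  real(dp) :: n_vec, n_mat
  real(dp) :: vec_flops, vec_bytes, mat_flops, mat_bytes

  n_vec = 5.0e7_dp     ! vector length used in this thread
  n_mat = 1000.0_dp    ! hypothetical matrix dimension (assumption)

  ! Vector add: one addition per element, three arrays of n elements.
  vec_flops = n_vec
  vec_bytes = 3.0_dp * n_vec * bytes_per_real

  ! Matrix multiply: about 2*n**3 operations, three n-by-n matrices.
  mat_flops = 2.0_dp * n_mat**3
  mat_bytes = 3.0_dp * n_mat**2 * bytes_per_real

  print '(a,f8.3)', 'vecadd flops per byte: ', vec_flops / vec_bytes
  print '(a,f8.3)', 'matmul flops per byte: ', mat_flops / mat_bytes
end program intensity_estimate
```

With these assumptions the vector add does about 0.083 operations per byte while the matrix multiply does about 167, so the cost of moving data to and from the device weighs much more heavily on vecadd.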

Note that your path to the Matmul example may be different. Also, this example uses the PGI Accelerator Model, which is the precursor to OpenACC's "kernels" construct.
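
For comparison, here is a minimal sketch of a matrix multiply written with the OpenACC "kernels" construct. This is not the code from the PGI sample; the program name, the matrix size and the timing with system_clock are all assumptions made for illustration. Without OpenACC enabled, the directives are treated as comments and the loop runs on the host.

```fortran
program matmul_sketch
  implicit none
  integer, parameter :: n = 256      ! hypothetical size, kept small
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  real :: s
  integer :: i, j, k
  integer :: t0, t1, rate

  allocate(a(n,n), b(n,n), c(n,n))
  a = 1.0
  b = 2.0

  call system_clock(t0, rate)
  ! Copy a and b to the device, compute there, and copy c back.
  !$acc kernels copyin(a, b) copyout(c)
  do j = 1, n
    do i = 1, n
      s = 0.0
      do k = 1, n
        s = s + a(i,k) * b(k,j)
      end do
      c(i,j) = s
    end do
  end do
  !$acc end kernels
  call system_clock(t1)

  ! Every element should equal n * 1.0 * 2.0.
  print *, 'c(1,1) = ', c(1,1)
  print *, 'elapsed seconds: ', real(t1 - t0) / real(rate)
end program matmul_sketch
```

Timing only the region around the loop nest, as sketched here, makes it easier to compare the host and accelerator builds without including array setup.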

- Mat
All times are GMT - 7 Hours