
Michael Wolfe's Programming Guide
The PGI Accelerator Programming Model on NVIDIA GPUs—Part 2 Performance Tuning
Part 1 of this series introduced the PGI Accelerator Programming Model, showed three simple programs in C and in Fortran, and presented a few details for building and running a program on the GPU.
Part 2 looks at the issues that affect performance and at how to recognize and address them. It includes an in-depth look at the four most important: writing an appropriately parallel algorithm, tuning data movement between the host and the accelerator, tuning memory loads and stores on the accelerator, and tuning the loop schedule.
Also in this issue…
Tool Tips: A New Direction for PGI Performance Profiling
This article introduces PGCOLLECT, a new, easy-to-use tool for collecting performance data.