PGI Accelerator Compilers for x64+GPU
The PGI 8.0 release includes a technology preview of the PGI accelerator programming strategy. Using the provisional support in PGI Release 8.0, programmers can accelerate Linux applications on x64+GPU platforms by adding OpenMP-like compiler directives to existing high-level standard-compliant Fortran and C programs and then recompiling with appropriate compiler options.
Sample Fortran matrix multiplication loop, tagged to be compiled for an accelerator.
!$acc region
   do k = 1,n1
      do i = 1,n3
         c(i,k) = 0.0
         do j = 1,n2
            c(i,k) = c(i,k) + a(i,j) * b(j,k)
         enddo
      enddo
   enddo
!$acc end region
How It Works
Until now, C and C++ developers targeting GPU accelerators have had to rely on language extensions to their programs. Use of GPUs from Fortran applications has been extremely limited. x64+GPU programmers have had to program at a detailed level: understanding and specifying data usage information, and manually constructing sequences of calls to manage all movement of data between the x64 host and the GPU.
The PGI 8.0 x64+GPU compilers automatically analyze whole program structure and data, split portions of the application between the x64 CPU and GPU as specified by user directives, and generate an optimized mapping of loops that automatically uses the parallel cores, hardware threading capabilities, and SIMD vector capabilities of modern GPUs. In addition to directives and pragmas that specify regions of code or functions to be accelerated, the PGI Fortran and C compilers will support user directives that give the programmer fine-grained control over the mapping of loops, allocation of memory, and optimization for the GPU memory hierarchy. The PGI compilers generate unified x64+GPU object files and executables that manage all movement of data to and from the GPU device. They work with all existing host-side utilities (linker, librarians, makefiles) and require no changes to the existing standard HPC Linux/x64 programming environment.
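For C programmers, the same matrix multiplication might be tagged as in the sketch below. This is an illustration, not text from the PGI documentation: it assumes the C spelling of the directive is `#pragma acc region` applied to a braced block, and the function name `matmul` and the flat row-major layout are choices made for the example. A real PGI build may also need data clauses describing the array extents; those are omitted because this page does not give their syntax. A compiler that does not recognize the pragma ignores it, so the code also runs unchanged on the host.

```c
#include <assert.h>

/* Hypothetical example: c = a * b in single precision.
 * a is n3 x n2, b is n2 x n1, c is n3 x n1, all stored row-major.
 * "restrict" tells the compiler the arrays do not overlap. */
void matmul(int n1, int n2, int n3,
            const float *restrict a,
            const float *restrict b,
            float *restrict c)
{
    assert(n1 > 0 && n2 > 0 && n3 > 0);

    /* Assumed C form of the Fortran "!$acc region" directive.
     * Compilers without accelerator support skip this line. */
#pragma acc region
    {
        for (int i = 0; i < n3; i++) {
            for (int k = 0; k < n1; k++) {
                float sum = 0.0f;
                for (int j = 0; j < n2; j++)
                    sum += a[i * n2 + j] * b[j * n1 + k];
                c[i * n1 + k] = sum;
            }
        }
    }
}
```

The loop nest itself is ordinary C; the only accelerator-specific line is the pragma, which is the point of the directive-based approach described above.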
FAQ
Q What programming languages do the PGI accelerator compilers support?
A Currently, PGI is working to add support for GPU accelerators to the PGF95 Fortran and PGCC ANSI C99 compilers. While adding support for C++ is technically feasible, we have no current timeline for the availability of this capability. We welcome your feedback.
Q On which operating systems do PGI accelerator compilers run?
A 64-bit Linux only in PGI 8.0. There is no technical barrier to eventually supporting this same programming model on Windows and Mac OS X. Your feedback is welcome.
Q Which accelerators can be targeted by PGI 8.0 compilers?
A PGI 8.0 compilers will target all CUDA-enabled NVIDIA GPU accelerators with compute capability 1.2 or higher. PGI will announce plans to support other accelerators shortly.
Q Do I need to install the CUDA software?
A Currently, you need to download and install the CUDA software from NVIDIA. In addition, you need to create or edit a file named 'sitenvrc' in the $PGI/8.0/bin/ installation directory, to add the lines:
set CUDA=/opt/cuda; # change this to point to your CUDA installation
set NVOPEN64DIR=$CUDA/open64/lib;
set CUDADIR=$CUDA/bin;
set CUDALIB=$CUDA/lib;
Q Does the compiler support IEEE standard floating point arithmetic?
A The GPU accelerators available today support most of the IEEE floating point standard. However, they do not support all the rounding modes, and some operations, notably square root, exponential, logarithm, and other transcendental functions, may not deliver full precision results. This is a hardware limitation that compilers cannot overcome.
Q Does the compiler support double precision?
A The technology preview supports integer and single precision floating point operations. Today's latest GPU accelerators do have support for double precision, but the performance is quite low relative to single precision. PGI plans to add support for double precision when the hardware performance improves. As always, your feedback is welcome.
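Before moving a loop to single precision, it can help to check on the host how much accuracy the computation loses. The sketch below is an illustration only and is unrelated to any PGI tool: the function names `sum_single` and `sum_double` are hypothetical. It accumulates the same single-precision value many times, once in a `float` accumulator and once in a `double` accumulator, so the two totals can be compared.

```c
#include <assert.h>

/* Hypothetical helper: add x to itself n times using a float accumulator. */
float sum_single(int n, float x)
{
    assert(n >= 0);
    float s = 0.0f;
    for (int i = 0; i < n; i++)
        s += x;
    return s;
}

/* Hypothetical helper: the same sum using a double accumulator. */
double sum_double(int n, float x)
{
    assert(n >= 0);
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += (double)x;
    return s;
}
```

For long accumulations the `float` total typically drifts further from the exact product `n * x` than the `double` total does. If that drift matters for your application, it is useful feedback on the priority of double precision support.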
Q Can I call a CUDA kernel function from my PGI-compiled code?
A PGI is working on the design of a feature to allow you to call kernel functions written in CUDA, PTX, or other languages directly from your C or Fortran program. We will announce this feature when it is available.
Q Does the compiler support two or more GPUs in the same program?
A As with CUDA, you can use two or more GPUs by using multiple threads, where each thread attaches to a different GPU and runs its programs on that GPU. The current release does not include support to automatically control two or more GPUs from the same accelerator region.
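Because the compiler does not split one accelerator region across GPUs, dividing the work among threads is left to the program. The sketch below shows only that partitioning step, under assumptions made for the example: the type `row_range` and the function `rows_for_device` are hypothetical names, and the call each thread would use to attach to its GPU is not named on this page, so it is not shown.

```c
#include <assert.h>

/* Hypothetical type: half-open range of rows [begin, end). */
typedef struct {
    int begin;
    int end;
} row_range;

/* Split nrows as evenly as possible across ndevices and return the
 * slice for one device. The first (nrows % ndevices) devices each
 * receive one extra row. */
row_range rows_for_device(int nrows, int ndevices, int device)
{
    assert(nrows >= 0 && ndevices > 0);
    assert(device >= 0 && device < ndevices);

    int base = nrows / ndevices;
    int extra = nrows % ndevices;

    row_range r;
    r.begin = device * base + (device < extra ? device : extra);
    r.end = r.begin + base + (device < extra ? 1 : 0);
    return r;
}
```

Each host thread would compute its own range, attach to its own GPU, and run an accelerator region over only those rows.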
Q Is there an effort to open your directives to a standards committee, like OpenMP?
A As we gain experience with our directives and programming model, we will be open to exploring a standardization effort.
Q Can I run my program on a machine that doesn't have an accelerator on it?
A Before we go to a formal production release, we will be able to use our PGI Unified Binary technology to generate code that works in the presence or absence of an accelerator.
Q Do I have to rebuild my application for each different GPU model?
A The GPU code generated uses the same technology that is used for graphics applications and games; that is, the program uses a portable intermediate format which is then dynamically translated and reoptimized at run time by the drivers supplied by the vendor for the particular model of GPU in your machine. This preserves your investment by allowing your programs to continue to work even when you upgrade your GPU card, or use your program on a machine with a different model of GPU.
Q Can I use function or procedure calls in my GPU code?
A Current GPUs do not support function calls. The compiler will support function calls only if they can be inlined.
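As an illustration, a small helper defined in the same source file is the kind of call a compiler can usually inline. The sketch below assumes the C directive is spelled `#pragma acc region`; the names `axpy_elem` and `saxpy` are hypothetical. Whether a given call is actually inlined depends on the compiler and its options, which this page does not specify.

```c
#include <assert.h>

/* Small helper visible in the same translation unit, so the
 * compiler has the body available for inlining. */
static inline float axpy_elem(float alpha, float x, float y)
{
    return alpha * x + y;
}

/* Hypothetical example: y = alpha * x + y over n elements. */
void saxpy(int n, float alpha, const float *restrict x, float *restrict y)
{
    assert(n >= 0);

    /* Assumed C form of the accelerator region directive.
     * Compilers without accelerator support skip this line. */
#pragma acc region
    {
        for (int i = 0; i < n; i++)
            y[i] = axpy_elem(alpha, x[i], y[i]);
    }
}
```

A call to a routine in a separately compiled library, whose body the compiler cannot see, would not qualify under the restriction described above.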
Q When will you support <my favorite feature> in your compiler?
A Some features cannot be supported due to limitations of the hardware. Other features are not being supported because they would not deliver satisfactory performance. Still other features are planned for future implementation. Your feedback can affect our priorities.
Q Are all the specified directives supported in the technology preview?
A Not all the directives in the PGI Accelerator Compiler Technology Preview white paper will be implemented at the time of the initial technology preview release. The directives that will be implemented and those that are deferred are listed in the Restrictions and Limitations chapter of the white paper. This list will be updated with each release of the compiler.
Q What are the requirements to participate in the technology preview program?
A PGI is limiting participation early in the accelerator compiler technology preview program. As the program matures, participation will increase. To be considered for the program, you must have a PGI Linux license with a current PGI service subscription, and you must complete the PGI Accelerator Evaluation Request form. You will be notified within three business days if your request is approved.
Availability
x64+GPU targeting capability is enabled in PGI 8.0 using a special license key. It will be available starting in mid-January 2009 to selected PGI licensees during the technology preview period. Availability will be limited to ensure that PGI customers experimenting with the technology receive adequate and timely support. The preview period will last several months, with a formal production release expected in Spring 2009. PGI Linux licensees interested in participating in the technology preview can apply for consideration by completing the PGI Accelerator Compiler Evaluation Request form.
Pricing
Pricing details for PGI accelerator compilers have not been finalized yet. A production release is expected in Spring 2009, and general pricing terms will be as follows:
- Academic and government licensees with a current PGI service subscription will be able to upgrade their license to an accelerator-enabled license at no charge and with no increase in their annual subscription fee.
- Commercial licensees with a current PGI service subscription will be able to get a full license fee credit when upgrading to an accelerator-enabled license; commercial annual subscription fees will increase commensurate with the license fee increase.
- License fees for commercial ISVs interested in developing accelerator-enabled for-fee software products built with PGI accelerator compilers have yet to be determined.
Literature & Documentation
- PGI Accelerator Compiler Technology Preview White Paper
- NVIDIA CUDA Zone
- Compilers & More: A GPU Accelerator Programming Model, by Michael Wolfe, HPCwire, 9 December 2008
- Compilers & More: Optimizing GPU Kernels, by Michael Wolfe, HPCwire, 30 October 2008
- Compilers & More: Programming GPUs Today, by Michael Wolfe, HPCwire, 9 October 2008
- Compilers & More: GPU Architecture & Applications, by Michael Wolfe, HPCwire, 10 September 2008
- Compilers & More GPU articles collected (640KB PDF file)
- How We Should Program GPU's by Michael Wolfe, published by Linux Journal magazine (registration required).