
PGInsider Newsletters


White Papers and Specifications

  • PGI Fortran & C Accelerator Programming Model new features (ver. 1.3)

    This document describes the next-generation features and capabilities planned for the PGI Accelerator programming model.

  • PGI Fortran & C Accelerator Programming Model current features (ver. 1.2)

    This document describes the currently supported features and limitations of the PGI Accelerator programming model.

  • PGI Fortran & C Accelerator Programming Model full specification (ver. 1.0)

    This document describes a collection of compiler directives used to specify regions of code in Fortran and C programs that can be offloaded from a host CPU to an attached accelerator. The method outlined provides a model for accelerator programming that is portable across operating systems and various types of host CPUs and accelerators. The directives extend the ISO/ANSI standard C and Fortran base languages in a way that allows a programmer to migrate applications incrementally to accelerator targets using standards-compliant Fortran or C.

  • CUDA Fortran Programming Guide and Reference (ver. 1.3, May 2010, 136KB)

    NVIDIA CUDA™ is a general-purpose parallel programming architecture with compilers and libraries to support the programming of NVIDIA GPUs. This document describes CUDA Fortran, a small set of extensions to Fortran that supports and is built upon the CUDA computing architecture.

  • Common Compiler Feedback Format (CCFF) Draft Specification

    CCFF is the Common Compiler Feedback Format, initially defined and implemented by PGI. PGI compilers add CCFF information to object and executable files; this information can be extracted into a separate file or read directly from its section of the object or executable file. The CCFF information is stored as an XML file, whose structure we describe here. See also CCFF XML schema and CCFF Repository XML schema.

Technical Papers and Presentations

  • Michael Wolfe's SC12 GPU Programming with OpenACC Tutorials

    Introduction slide deck 783KB PDF
    Advanced slide deck 602KB PDF
    Tutorial examples 17KB TAR

  • Michael Wolfe's SC11 Advanced GPU Programming Tutorial

    Slide deck 967KB PDF
    Tutorial examples 555KB TAR

  • Michael Wolfe's GPU Programming Tutorial

    Course materials from Michael Wolfe's day-long tutorial on programming GPUs using the directive-based PGI Accelerator compilers and CUDA Fortran.

    1. Introduction 341KB PDF
    2. CUDA Fortran 905KB PDF
    3. PGI Accelerator Compilers 909KB PDF
    4. Conclusion 112KB PDF

    Tutorial examples and labs 25.5MB TAR

  • Optimizing Application Performance on x64 Processor-based Systems with PGI Compilers and Tools
    by Douglas Miles, Brent Leback and David Norton

    PGI Fortran, C and C++ compilers and tools are supported on most x64 processor-based systems. Optimizing performance of the x64 processors in these systems often depends on maximizing SSE vectorization, ensuring alignment of vectors, and minimizing the number of cycles the processors are stalled waiting on data from main memory. The PGI compilers support a number of directives and options that allow the programmer to control and guide optimizations including vectorization, parallelization, function inlining, memory prefetching, interprocedural optimization, and others. In this paper we provide detailed examples of the use of several of these features as a means for extracting maximum single-node performance from x64 processor-based systems using PGI compilers and tools.

  • Tuning C++ Applications for the Latest Generation x64 Processors with PGI Compilers and Tools
    by Douglas Doerfler and David Hensinger
    Sandia National Laboratories
    Brent Leback and Douglas Miles
    The Portland Group (PGI)

    Tuning numerically intensive C++ applications for maximum performance can be a challenge. This paper illustrates the importance of SSE vectorization on modern processors, and uses the ALEGRA shock physics code as an example of how a C++ application can be re-structured to enable vectorization and other optimizations that lead to dramatic performance improvements.
