Technical News from The Portland Group


In This Issue | SEP 2013

Tesla vs. Xeon Phi vs. Radeon

OpenACC Interoperability Tricks

Improving Application Performance with Allinea MAP

Object-Oriented Programming in Fortran 2003 Part 3: Parameterized Derived Types

Calling CUDA Fortran Kernels from MATLAB

Upcoming Events

PGI will be exhibiting in booth #2509 at SC13 November 18-21 in Denver, Colorado.

PGI will be presenting a two-day tutorial and coding workshop at Sandia National Laboratories on October 8-9.

Resources

OpenACC website

PGI Accelerator with OpenACC
Getting Started Guide

PGI Accelerator

CUDA Fortran

PGI User Forums

Recent News

NVIDIA Acquires PGI

PGI Accelerator Compilers Add Support for AMD APUs and GPUs

Next Issue

OpenACC 2.0

OpenACC Success Stories

x86 Performance Enhancements in PGI 2013

Introducing the SPEC ACCEL Benchmarks

The Portland Group
Suite 320
Two Centerpointe Drive
Lake Oswego, OR 97035

Michael Wolfe

Tesla vs. Xeon Phi vs. Radeon

Michael Wolfe's
Programming Guide

NVIDIA Tesla, Intel Xeon Phi and AMD Radeon have many common features that can be leveraged to create an accelerator programming model that delivers language, functional and performance portability across devices.

In this article, Michael Wolfe takes a comprehensive look at the HPC accelerator landscape. He starts by describing the key characteristics of each platform from a programming standpoint, calling out similarities and differences of significance to developers. From there, Michael dives into programming options, comparing and contrasting a number of popular programming models including OpenCL, CUDA, C++ AMP, OpenMP and OpenACC. He concludes with a "wish list" of accelerator programming features and a set of recommendations. | Continue to the article…

OpenACC Interoperability Tricks

OpenACC is a powerful programming model for systems with accelerators such as GPUs. Part of OpenACC's power comes from the programmer's ability to express parallelism at a high level while still obtaining a substantial speed-up over the CPU. Interoperability with other GPU programming models gives programmers the flexibility to choose whichever model works best for each part of an application, and with it the best available performance. In this article Jeff Larkin demonstrates a number of ways of combining OpenACC with other common parallel programming methods. | Continue to the article…
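As a small illustration of the kind of interoperability the article covers, the sketch below uses OpenACC's host_data use_device directive to hand device addresses to a routine that expects raw pointers. The saxpy_device routine here is a hypothetical stand-in for a CUDA kernel launcher or an accelerated library call; it is not code from the article. Compiled without OpenACC support, the pragmas are simply ignored and the same code runs on the host.

```c
#define N 1024

/* Hypothetical stand-in for a routine that expects raw device
   pointers, e.g. a CUDA kernel launcher or a cuBLAS call.  The
   deviceptr clause tells OpenACC these pointers already refer to
   device memory. */
void saxpy_device(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop deviceptr(x, y)
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Driver showing the interoperability pattern: OpenACC owns the data
   region, and host_data exposes the device addresses of x and y so
   they can be passed to the pointer-based routine above. */
float saxpy_demo(void)
{
    float x[N], y[N];
    for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    #pragma acc data copyin(x) copy(y)
    {
        #pragma acc host_data use_device(x, y)
        saxpy_device(N, 3.0f, x, y);
    }
    return y[0];   /* 2 + 3*1 */
}
```

Built with an OpenACC compiler (e.g. pgcc -acc) the data movement and loop run on the accelerator; built with a plain C compiler the pragmas are ignored and the identical source runs serially, which is part of OpenACC's portability appeal.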

Improving Application Performance with Allinea MAP

David Lecomber, CTO at Allinea Software, demonstrates using Allinea MAP with PGI 2013. He walks through a simple matrix multiply example, using it to identify performance limitations, then shows how MAP used in combination with PGI compiler ‑Minfo output messages can quickly optimize the example and realize a better than 12x speed-up. | Continue to the article…
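The restructuring that profiling and ‑Minfo feedback typically guide can be sketched as follows. This is a hypothetical example in the spirit of the article's matrix multiply walkthrough, not its actual code: the naive loop order strides through one matrix column-wise, while the interchanged ordering makes the inner loop unit-stride so the compiler can vectorize it.

```c
#include <string.h>

#define N 64

/* Naive i-j-k order: the inner loop reads b[k][j] down a column,
   a strided access pattern that hurts cache reuse and blocks
   vectorization of the inner loop. */
void matmul_naive(const double a[N][N], const double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            double sum = 0.0;
            for (int k = 0; k < N; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Interchanged i-k-j order: the inner loop now updates c[i][j] and
   reads b[k][j] contiguously, giving unit-stride accesses that the
   compiler can vectorize (the kind of change -Minfo messages help
   confirm). */
void matmul_ikj(const double a[N][N], const double b[N][N], double c[N][N])
{
    memset(c, 0, N * N * sizeof(double));
    for (int i = 0; i < N; ++i)
        for (int k = 0; k < N; ++k)
            for (int j = 0; j < N; ++j)
                c[i][j] += a[i][k] * b[k][j];
}
```

Both versions compute the same product; a profiler such as MAP shows where the time goes in the naive version, and the compiler's ‑Minfo output reports whether the restructured inner loop was actually vectorized.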

Object-Oriented Programming in Fortran 2003 Part 3: Parameterized Derived Types

Parameterized derived types let programmers create derived types that take values, known as parameters, to specify characteristics of the encapsulated data. These characteristics can specify the precision and amount (or length) of data. Parameterized derived types offer two distinct advantages: they provide a way to specify variable length data without requiring an explicit dynamic allocation and they permit code reusability because they don't require a rewrite to use different kinds of data. | Continue to the article…

Calling CUDA Fortran Kernels from MATLAB

In this brief hands-on example, Massimiliano Fatica outlines what's involved and walks through the steps required to call a CUDA Fortran kernel from MATLAB. | Continue to the article…

Improve your cluster performance by up to 75%
