Who's using OpenACC?
Researchers, scientists and engineers worldwide are using PGI compilers and OpenACC to GPU-accelerate over 200 cutting-edge codes in a variety of domains, including molecular dynamics, CFD, quantum chemistry, weather and climate, astrophysics, and others.


Mike Frisch, Ph.D.
President and CEO
Gaussian, Inc.
“Using OpenACC allowed us to continue development of our fundamental algorithms and software capabilities simultaneously with the GPU-related work. In the end, we could use the same code base for SMP, cluster/network and GPU parallelism. PGI's compilers were essential to the success of our efforts.”


Sunil Sathe
Lead Software Developer
ANSYS Fluent
“We recently started evaluating OpenACC for parallelization on multicore CPUs. Based on some early work, one of our OpenACC-based solvers is as fast as the OpenMP version on multicore CPUs and delivers speed-ups of 4-6x on a Tesla P100 compared to all the cores of a dual-socket server.”


Prof. Georg Kresse
Computational Materials Physics
University of Vienna
“For VASP, OpenACC is the way forward for GPU acceleration. Performance is similar to CUDA C, and OpenACC dramatically decreases GPU development and maintenance efforts.”


Richard Loft
Director, Technology Development
NCAR
“Our team has been evaluating OpenACC as a pathway to performance portability for the Model for Prediction Across Scales (MPAS) atmospheric model. Using this approach on the MPAS dynamical core, we have achieved performance on a single P100 GPU equivalent to 2.7 dual-socket Intel Xeon nodes on our new Cheyenne supercomputer.”


Takuma Yamaguchi, Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Wijerathne
University of Tokyo
“With OpenACC and a compute node based on NVIDIA's Tesla P100 GPU, we achieved more than a 14X speed up over a K Computer node running our earthquake disaster simulation code.”


Dr. Oliver Fuhrer
Senior Scientist
MeteoSwiss
“OpenACC made it practical to develop for GPU-based hardware while retaining a single source for almost all the COSMO physics code.”


Zhihong Lin
Professor and Principal Investigator
UC Irvine
“Using OpenACC our scientists were able to achieve the acceleration needed for integrated fusion simulation with a minimum investment of time and effort in learning to program GPUs.”


John Stone
Senior Research Programmer
Beckman Institute
University of Illinois
“Due to Amdahl's law, we need to port more parts of our code to the GPU if we're going to speed it up. But the sheer number of routines poses a challenge. OpenACC directives give us a low-cost approach to getting at least some speed-up out of these second-tier routines. In many cases it's completely sufficient because with the current algorithms, GPU performance is bandwidth-bound.”


Ronald M. Caplan
Computational Scientist
Predictive Science Inc.
“Adding OpenACC into MAS has given us the ability to migrate medium-sized simulations from a multi-node CPU cluster to a single multi-GPU server. The implementation yielded a portable single-source code for both CPU and GPU runs. Future work will add OpenACC to the remaining model features, enabling GPU-accelerated realistic solar storm modeling.”


Somnath Roy
Assistant Professor
Mechanical Engineering Department
Indian Institute of Technology Kharagpur
“Using OpenACC to accelerate our immersed boundary incompressible CFD code, we’re seeing an order of magnitude reduction in computing time running on GPUs. Routines involving our search algorithm and matrix solvers perform especially well with OpenACC, and improve the overall scalability of the code.”


David Gutzwiller
Lead Software Developer
NUMECA
“Porting our unstructured C++ CFD solver FINE/Open to GPUs using OpenACC would have been impossible two or three years ago, but OpenACC has developed enough that we’re now getting some really good results.”


Abhilash Jayaraj
Project Scientist
Indian Institute of Technology, New Delhi
“In an academic environment, maintenance and speedup of existing codes is a tedious task. OpenACC provides a great platform for computational scientists to accomplish both tasks without involving a lot of effort or manpower in speeding up the entire computational task.”


Dr. Lutz Schneider
Senior R&D Engineer
Synopsys Inc.
“Using OpenACC, we've accelerated the Synopsys TCAD Sentaurus Device EMW simulator to speed up FDTD simulations of image sensors by a factor of 15 on a single V100 GPU compared to a dual-socket Broadwell server. GPUs are key to improving simulation throughput in the design of advanced image sensors.”


Mark A. Taylor
Multiphysics Applications
Sandia National Laboratories
“The CAAR project provided us with early access to Summit hardware and access to PGI compiler experts. Both of these were critical to our success. PGI’s OpenACC support remains the best available and is competitive with much more intrusive programming model approaches.”


Misun Min
Computational Scientist
Argonne National Laboratory
“The most significant result from our performance studies is faster computation with less energy consumption compared with our CPU-only runs. The GPU required only 39 percent of the energy needed for 16 CPUs to do the same computation. That OpenACC is an open standard was an important factor in our decision to use it for our research.”


Adam Jacobs
PhD Candidate
Stony Brook University
“For scientific applications that run on several different supercomputing architectures and need to be usable for many generations of architecture, the cons of something like CUDA outweigh the pros. That’s why we prefer OpenACC.”


Richard Sandberg
Investigator
University of Melbourne
“For a relatively small production case of about 80 million grid points, our OpenACC-enabled version running on two GPU nodes with 4 P100 GPUs each was approximately 20.5x faster than two nodes with 24 Haswell cores each.”


Michael Ni
CEO
AeroDynamic Solutions, Inc.
“Using OpenACC on a single Tesla V100 in the Amazon Cloud, our GPU-accelerated Code Leo flow solver runs 1.4 times faster at 30% less cost compared to runs using all 72 vCPU cores of a Xeon Platinum c5.18xlarge instance. We feel this will revolutionize the aerospace design cycle, enabling the delivery of more durable, more reliable and higher performing designs at reduced development cost.”


Michael Barad
Research Aerospace Engineer
NASA Ames Research Center
“We used OpenACC to port our LAVA Lattice-Boltzmann mini-app to GPU. Running a single block grid at 256³, one Volta V100 was 5x faster than two 40-core Skylake CPUs.”


Munikrishna Nagaram
Chief Technology Officer
S&I Engineering Solutions Pvt. Ltd.
“OpenACC allowed us to port our legacy CFD solvers to hybrid CPU+GPU platforms in a form that is readable and maintainable, and enabled us to exploit GPUs in our HiFUN MPI CFD solver in very little time.”


Bronson Messer
Senior Scientist
Oak Ridge National Laboratory
“We’re using OpenACC on Summit to accelerate our most compute-intensive kernels. We love OpenACC interoperability and how this allows us to use multiple methods to perform memory placement and movement. CPU+GPU performance of a 288 species network on Summit, something impossible to do on Titan, is 2.9x faster than CPU only.”


Ludwig Schneider
PhD Student
Georg-August-Universität Göttingen
“OpenACC enables us to compile a single code base for multiple architectures. This keeps the code maintainable and flexible even for future accelerators. For our OpenACC accelerated ‘SCMF’ algorithm, a single V100 outperforms 24 CPU cores by roughly a factor of 10.”


Igor Sfiligoi
HPC Software Developer
General Atomics
“We planned to spend a month porting our OpenACC code from x86 + GPUs to POWER9 + GPUs. Using the PGI compilers and our standard build environment, we were running in an afternoon. It just worked!”


C. S. Chang
Principal Investigator
Princeton Plasma Physics Laboratory
Princeton University
“Using a combination of CUDA and OpenACC for our most compute-intensive kernels, the GPU-accelerated version of XGC delivers over 11x speed-ups compared to CPU-only execution when running at scale on 2048 nodes of ORNL’s new Summit supercomputer.”


Dmytro Bykov
Computational Scientist
Oak Ridge National Laboratory
“Using OpenACC, we see large performance gains with very little effort. GPU acceleration varies over the course of our simulations due to differing fragment sizes, but is typically 3x–5x. On Summit we can now do simulations of several thousand atoms, compared to maybe 800 on Titan.”


Gabriel Staffelbach
Senior Researcher
CERFACS
“OpenACC allows us to develop and maintain a state-of-the-art CFD application on GPUs with full portability to other platforms. We have only 20 lines of GPU-specific code in our 300,000 line source base. The optimization feedback provided by the OpenACC compiler was invaluable in our porting efforts.”