The PGI CDK® Cluster Development Kit® enables use of networked clusters of AMD or Intel x64 processor-based workstations and servers to tackle the largest scientific computing applications. The PGI CDK includes pre-configured versions of MPI for Ethernet and InfiniBand to enable development, debugging and tuning of high-performance MPI or hybrid MPI/OpenMP applications written in Fortran, C or C++.
PGI compilers offer world-class performance and features including auto-parallelization and OpenMP 3.1 directive-based parallelization for multicore processors, OpenACC 2.0 directive-based parallel programming for accelerators, and support for the PGI Unified Binary technology. The PGI Unified Binary streamlines cross-platform support by combining code optimized for multiple x64 processors into a single executable file. This helps ensure that your applications run correctly and with optimal performance regardless of the type of x64 processor on which they are deployed, or even whether your system includes an accelerator. PGI's state-of-the-art compiler optimization technologies include SSE vectorization, auto-parallelization, inter-procedural analysis and optimization, memory hierarchy optimizations, function inlining (including library functions), profile feedback optimization, CPU-specific microarchitecture optimizations and more.
Debugging a cluster MPI application can be extremely challenging. The PGDBG® debugger provides a comprehensive set of graphical user interface (GUI) elements to assist you in this process. PGDBG provides the ability to separately debug and control OpenMP threads and MPI processes on your Linux cluster. Step, Break, Run or Halt OpenMP threads or MPI processes individually, as a group, or in user-defined process/thread subsets. PGDBG can even display the state of MPI message queues, enabling you to quickly isolate and resolve message-passing deadlock bugs. Using a single integrated multi-process debugging window, PGDBG provides precise control and feedback on the state of every MPI process and OpenMP thread simultaneously, with fully integrated capabilities for debugging hybrid parallel programs that use MPI message-passing between nodes and OpenMP shared-memory parallelism within a multicore processor-based cluster node.
The main PGDBG window displays Fortran, C or C++ program source code, optionally interleaved with the corresponding x64 assembly code. In addition to the main source code window, PGDBG provides supplementary program information in a number of tabbed panels including call stack, registers, local variables, memory, a command line, events, graphical process and thread grid, status messages, MPI messages and group information. PGDBG is interoperable with the GNU gcc/g++ compilers on Linux.
PGPROF® is a powerful and easy-to-use interactive postmortem statistical analyzer for OpenMP and OpenACC programs. Use PGPROF to visualize and diagnose the performance of the components of your program. PGPROF associates execution time with the source code of your program allowing you to see where and how execution time is spent. Along with compiler feedback information, PGPROF also provides features for helping you to understand why certain parts of your program have high execution times.
Use PGPROF to analyze programs on multicore SMP servers, distributed-memory clusters and hybrid clusters where each node contains multicore x64 processors. Use the PGPROF profiler to profile parallel programs and GPU-accelerated programs. PGPROF supports profiling at the function and source code line level for PGI-compiled Fortran, C and C++ programs.
Using the Common Compiler Feedback Format (CCFF), PGI compilers save information about how your program was optimized, or why a particular optimization was not made. PGPROF can extract this information and associate it with source code and other performance data, enabling you to view all of this information simultaneously.
Each performance profile depends on the resources of the system where it is run. PGPROF provides a summary of the processor(s) and operating system(s) used by the application during any given performance experiment.
PGPROF provides the information required for determining which functions and lines in an application are consuming the most execution time. Combined with the feedback features of the PGI compilers, PGPROF enables maximizing vectorization and performance on a single x64 processor core.
The OpenMP and MPI parallel PGDBG debugger included with the PGI CDK supports MPICH, MPICH2, MPICH3, SGI-MPI and Open MPI over Ethernet, and MVAPICH and MVAPICH2 over InfiniBand clusters. MPICH, developed at the Argonne National Laboratory, is an open source implementation of the Message-Passing Interface (MPI) standard. MPICH is a full implementation of MPI, so your existing MPI applications will port easily to your Linux cluster using the PGI CDK.
MVAPICH, the "MPI over InfiniBand, iWARP and RDMA-enabled Interconnects" project, is led by the Network-Based Computing Laboratory in the Department of Computer Science and Engineering at the Ohio State University.
Request a 30-day trial of the PGI CDK by completing the PGI CDK Evaluation Request Form.
A partial list of technical features supported includes the following:
Note: Heterogeneous systems that include both 32-bit and 64-bit processor-based workstations or servers are not supported.
Please Note: 32-bit development is deprecated with the PGI 2016 release and will no longer be available with the PGI 2017 release.