For users interested in compiling their code to run reasonably fast, we provide this list of recommended default flags. For users interested in achieving peak performance, a list of tuning flags follows below.
PGI Recommended Default Flags
| Compiler | Flags |
|---|---|
| PGFORTRAN | -fast -Mipa=fast,inline |
| PGCC | -fast -Mipa=fast,inline -Msmartalloc |
| PGC++ | -fast -Mipa=fast,inline -Msmartalloc |
Where:
| Flag | Description |
|---|---|
| -fast | A generally optimal set of options including global optimization, SIMD vectorization, loop unrolling and cache optimizations. |
| -Mipa=fast,inline | Aggressive inter-procedural analysis and optimization, including automatic inlining. |
| -Msmartalloc | Use optimized memory allocation (Linux only). |
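As a rough illustration of how these defaults might be wired into a build script, the sketch below maps each compiler driver to its flags from the table above. The function name `pgi_default_flags` and the file names in the usage comment are hypothetical; only the flag strings come from the table.

```shell
#!/bin/sh
# Hypothetical helper: print the recommended default flags for a PGI driver.
# The flag strings are copied from the table above; the function name is ours.
pgi_default_flags() {
    case "$1" in
        pgfortran)   echo "-fast -Mipa=fast,inline" ;;
        pgcc|pgc++)  echo "-fast -Mipa=fast,inline -Msmartalloc" ;;
        *)           echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Example use (requires the PGI compilers; myprog.c is a placeholder):
#   pgcc $(pgi_default_flags pgcc) -o myprog myprog.c
```

Keeping the flags in one place like this makes it easier to switch to the tuning flags later without editing every compile line.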
PGI Tuning Flags
| Flag | Usage |
|---|---|
| -Mconcur | Enable auto-parallelization; for use with multi-core or multi-processor targets. |
| -mp | Enable OpenMP; honor user-inserted parallel programming directives and pragmas. |
| -Mprefetch | Control generation of prefetch instructions to improve memory performance in compute-intensive loops. |
| -Msafeptr | Ignore potential data dependencies between C/C++ pointers. |
| -Mfprelaxed | Relax floating point precision; trade accuracy for speed. |
| -tp=cpua,cpub | Create a PGI Unified Binary that runs correctly on, and is optimized for, two or more CPU types. For example, -tp=sandybridge,bulldozer optimizes for both Intel 'Sandy Bridge' and AMD 'Bulldozer' CPU types. |
| -Mpfi/-Mpfo | Profile Feedback Optimization; requires two compilation passes and an interim execution to generate a profile. |
Please see the PGI Compiler Reference Manual for detailed flag information. Find more specific information for tuning many popular community applications on the Porting & Tuning Guides page.
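The two-pass workflow for -Mpfi/-Mpfo listed in the table above might look like the following sketch. The function `pfo_build`, the `DRY_RUN` switch and the file names are hypothetical, and combining the passes with -fast is our assumption. The sketch also assumes the program runs without arguments and that this run is representative of real use; check the reference manual for where the profile data is written.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# pfo_build SRC EXE: instrumented build, profiling run, then feedback build.
pfo_build() {
    src=$1; exe=$2
    run pgfortran -fast -Mpfi -o "$exe" "$src" || return 1   # pass 1: instrument
    run "./$exe" || return 1                                  # interim run writes the profile
    run pgfortran -fast -Mpfo -o "$exe" "$src"                # pass 2: use the profile
}

# Example use (requires pgfortran; x.f90 is a placeholder):
#   pfo_build x.f90 x
```

Setting `DRY_RUN=1` before calling `pfo_build` prints the three commands without executing them, which is a cheap way to review the sequence first.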
Inline intrinsic functions map to actual x86 or x64 machine instructions. Intrinsics are inserted inline to avoid the overhead of a function call. Because the compiler has special knowledge of intrinsics, it can often generate better code for them than for extended inline assembly. Intrinsics are available in C and C++ programs running on Linux or Windows only.
See the PGI Compiler User's Guide, Chapter 15, for more information about intrinsics.
To suppress the FORTRAN STOP message, declare the NO_STOP_MESSAGE environment variable and assign it any value.
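For example, in a Bourne-style shell the variable could be set as below before running the program. The value 1 is an arbitrary choice, since any value works. In csh the equivalent is `setenv NO_STOP_MESSAGE 1`.

```shell
# Any value works; 1 is an arbitrary choice.
NO_STOP_MESSAGE=1
export NO_STOP_MESSAGE
```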
If any of the compilers fail intermittently with a SIGSEGV (signal 11), the cause may be your temporary directory.
A compile terminated this way looks like the following:
% pgfortran x.f90 -fast
pgf90-Fatal-/usr/pgi/linux86-64/12.0/bin/pgf901 TERMINATED by signal 11
Arguments to /usr/pgi/linux86-64/12.0/bin/pgf901
/usr/pgi/linux86-64/12.0/bin/pgf901 x.f90 -opt 2 -terse 1 -inform warn -nohpf -nostatic
-x 19 0x400000 -quad -x 59 4 -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c
-x 58 0x10000 -x 124 0x1000 -x 57 0xfb0000 -x 58 0x78031040 -x 70 0x6c00 -x 47 0x400000
-x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /usr/pgi/linux86-64/12.0/include:/usr/local/
include:/usr/lib/gcc/x86_64-redhat-linux/4.1.2/include:/usr/lib/gcc/x86_64-redhat-linux/
4.1.2/include:/usr/include -def unix i -def __unix -def __unix__ -def linux -def __linux
-def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L
-def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW=
-def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__
-def __SSSE3__ -preprocess -freeform -vect 48 -y 54 1 -x 53 2 -quad -x 119 0x10000000
-modexport /tmp/pgf90Em9bAdWSVOUt.cmod -modindex /tmp/pgf90om9bQEWzq5Vy.cmdx
-output /tmp/pgf90Um9bkYHrQ0B4.ilm
^^^^^^^^^^^^^^^
The file /tmp/pgf90Um9bkYHrQ0B4.ilm is one of several temporary files the compiler creates while compiling and linking source code. You can change the temporary directory by setting the TMPDIR environment variable to another directory.
Whichever directory you use for temporary files, make sure you have enough space in it, and permissions to write to it, or you may see errors like the above. Space constraints can also cause truncated temp files leading to other strange behavior.
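A quick check of the temporary directory before a long build could look like the sketch below. The function name `check_tmpdir` and the 100 MB default threshold are hypothetical, not PGI requirements; adjust the threshold to the size of your builds.

```shell
#!/bin/sh
# Hypothetical helper: check that a directory is usable for compiler temp files.
# Usage: check_tmpdir DIR [MIN_KB]   (MIN_KB defaults to an arbitrary 102400)
check_tmpdir() {
    dir=$1
    min_kb=${2:-102400}
    [ -d "$dir" ] || { echo "$dir: not a directory"; return 1; }
    [ -w "$dir" ] || { echo "$dir: not writable"; return 1; }
    # df -Pk prints POSIX-format output in 1024-byte blocks; field 4 is available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$min_kb" ] || { echo "$dir: only ${avail_kb} KB free"; return 1; }
    echo "$dir: ok"
}

# Fall back to /tmp when TMPDIR is unset, matching the paths in the log above.
check_tmpdir "${TMPDIR:-/tmp}" || echo "consider pointing TMPDIR at another directory"
```

If the check fails, setting TMPDIR to a directory that passes it and rerunning the compile is the simplest next step.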
The statements
#include "filename"
include "filename"
are handled differently from each other in pgf77 and pgfortran. #include is a preprocessor directive, while the indented include statement is handled by the front end of the compiler.
To handle files with #include statements, either rename the file from x.f to x.F, or use the switch -Mpreprocess in your compile line.
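To find which sources would need one of these two fixes, a small check like the following could help. The function name `find_cpp_includes` and the `src/*.f` path are hypothetical; the function simply lists files that contain a line starting with #include.

```shell
#!/bin/sh
# Hypothetical helper: print the names of files containing cpp-style #include lines.
# Indented Fortran include statements do not match, since they do not start with '#'.
find_cpp_includes() {
    grep -l '^#[[:space:]]*include' "$@"
}

# Example use (requires pgfortran; src/*.f is a placeholder):
#   for f in $(find_cpp_includes src/*.f); do
#       pgfortran -Mpreprocess -c "$f"
#   done
```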
Some users want to keep gcc, and call pgcc as cc. The easiest way to do this is to create a script file named cc and make sure your path is set up to find the cc script file. The script file, using csh syntax, is:
#!/bin/csh
setenv PGI /usr/pgi   # or wherever pgcc is installed
set path=($PGI/linux86/bin $path)
# Pass all arguments through, preserving quoting
pgcc $argv:q
If the PGI environment variable is already set, then delete the setenv command.
We don't recommend renaming pgcc to cc. There are several changes necessary for this to work correctly, and each new release can cause problems due to changes in the driver structure.
The SIZE intrinsic returns a result of the default INTEGER kind, which is four bytes with the 64-bit compilers. When compiling with -i8, the default INTEGER size is eight bytes, so the SIZE result is eight bytes as well.