PGI Release 2016 version 16.4 and newer include updated FlexNet license daemons (version 11.13.1.3). This update addresses a FlexNet security vulnerability. We recommend all users update their license daemons. See the FlexNet Update FAQ for more information. This update also requires you to update your PGI license keys to a new format. Older keys are incompatible.

Which types of licenses are available?

There are four types of licenses (actually license keys) used with PGI Compilers and Tools for x86-based systems:

  • Starter, provided by PGI.
  • Community, provided by PGI.
  • Node-locked, created by a user.
  • Network floating, created by a user.

The first two license key types are temporary and in most cases work with any PGI release from version 7.2 onward.

  • A Starter license key typically has a 30 to 90 day time limit. It also introduces a time limit into compiled executable files: they will stop working when the license key expires. Starter license keys may be bundled in PGI download packages or provided directly to users for evaluation purposes. They work regardless of whether the FlexNet license manager (lmgrd) is running.
  • Community license keys are bundled in PGI download packages for versions designated a 'Community Edition' release. These are valid for one year from date of release, are not FlexNet managed and can be installed on any supported system type. There are no restrictions or time limits on compiled executable files.

The other license keys are permanent and perpetual — they never stop working for supported PGI versions. Except as noted below, all PGI permanent licenses are specific to individual PGI products supporting a single type of platform.

  • Node-locked licenses restrict use to a particular host and one user at a time. They are useful when a number of users want to share a PGI product on a single system. Node-locked licenses for x86 systems require the license service (lmgrd) to run on the same machine as the compilers. Compiled executable files can run on any compatible machine. Node-locked licenses for OpenPOWER use a proprietary licensing scheme.
  • Network floating license keys allow the license service to run on a machine different from the machines running the compilers. The number of concurrent seats in-use is determined by counting all unique user-system combinations currently running the PGI compilers. The maximum number of concurrent seats is limited by the number of seats specified in the license key. Floating license keys usually require only a single license server. Organizations with floating licenses can allow users to "borrow" seats for out-of-office compiling.
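The seat-counting rule for floating licenses can be illustrated with a short Python sketch. The session list and the function name are hypothetical; the only rule taken from the text above is that a seat is one unique user-system combination.

```python
def seats_in_use(active_sessions):
    """Count seats as the number of unique (user, host) pairs."""
    return len({(user, host) for user, host in active_sessions})


# Hypothetical running compiles, one (user, host) entry per process.
sessions = [
    ("alice", "node1"),
    ("alice", "node1"),  # same user, same host: still one seat
    ("alice", "node2"),  # same user, different host: a second seat
    ("bob", "node1"),
]

print(seats_in_use(sessions))  # 3
```

Under this reading, a user who runs many compiles in parallel on one machine consumes a single seat.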

Can you please give me an overview of a PGI license key?

Here is a synopsis of a typical PGI permanent license key. It is broken down into sections, with important parts labeled. The dashed lines and (!) comments are added, and the parts in parentheses are optional.

------------------------------------------------------------------------
SERVER <license server hostname> 0123456789ab 27000        !SERVER line
------------------------------------------------------------------------
DAEMON pgroupd (/path/to/pgroupd) (PORT=port_number)       !DAEMON line
------------------------------------------------------------------------
PACKAGE PGI<release>-<PGI PIN> pgroupd <support end date> COMPONENTS= \
<…>
6167 9C37 2315 DAA9 EFAE"
INCREMENT PGI<release>-<PGI PIN> pgroupd <support end date> permanent <# of seats> \
   VENDOR_STRING=<PGI PIN>:16:ws:accel HOSTID=<your hostid>
<…>
7015 3F05 ACDF 1B73 FB12"

The SERVER line has three components: the hostname of the license server, the hostid of the license server, and the PORT used by lmgrd to process license requests. You can edit the hostname and PORT (27000 by default) by hand without regenerating the license.

The DAEMON line has three components: the name of the daemon (pgroupd); the path to the daemon, needed only if it is not in the same directory as lmgrd (for example, /usr/pgi/daemon/pgroupd); and an optional PORT that pgroupd uses to communicate. The PORT can be any unused port number that the operating system allows, and you can change it by hand in the license file.

The next section begins with a line starting with PACKAGE, listing all available component features for the given license. This is followed by a section with a line beginning with INCREMENT. Both the PACKAGE and INCREMENT lines contain a date-formatted number like 2018.1231. This number designates when Support Service expires and with which releases the license will work. In this case, Support Service for the license expires on December 31 2018, and all versions released on or prior to that date are supported by this license key. The license key does not need to be updated until Support Service is renewed with a later date of expiration.
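As a rough illustration of how the support-end field is read, the following Python sketch parses a value such as 2018.1231 and checks whether a release date falls on or before it. The function names are hypothetical and the release dates are made up for the example; this is not a PGI tool.

```python
from datetime import date


def parse_support_end(field):
    """Parse a 'YYYY.MMDD' support-end field such as '2018.1231'."""
    year, month_day = field.split(".")
    return date(int(year), int(month_day[:2]), int(month_day[2:]))


def release_is_covered(support_end_field, release_date):
    """True if the release date is on or before the support end date."""
    return release_date <= parse_support_end(support_end_field)


print(parse_support_end("2018.1231"))                     # 2018-12-31
print(release_is_covered("2018.1231", date(2018, 6, 1)))  # True
print(release_is_covered("2018.1231", date(2019, 1, 15))) # False
```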


What are the most common license problems?

The most common license problems are:

  • Trying to run a version released after the Support Service expiration date in the license key.
  • Host-id used in the license key does not match the machine the license server is running on.
  • The hostname used is not being recognized properly by the license utilities.
  • Firewalls prevent the licensing daemons lmgrd and pgroupd from communicating.
  • Using a node-locked license on a machine other than the license server.

What is the 'hostid' or 'host-id' ?

On Windows, Linux, and OS X, the hostid is the ethernet address (a/k/a MAC, 12 hex digits) of the network card that is configured. Older releases supported only device eth0 for FlexNet style licensing, but releases 9.0 and later support multiple configured network cards. If your license works with the current release, it should work with previous releases back to 7.2. Making the current release license work often requires the version of pgroupd be the one included with the current release.

Whichever license is used, the hostid used to create the PGI license must not change, or license validation will fail. For example, some laptops disable the ethernet network card when no cable is connected. In this case, either generate the license with a different hostid, or refrain from using the compilers when the laptop is on wifi or has no network.


How do I know if the license hostid has changed, and what should I do then?

The SERVER line of your license file has the form

SERVER hostname_of_the_license_server 001122334455 27000

In the above, 001122334455 is the hostid, and 27000 is the PORT used for lmgrd.

If you have a permanent license, run

lmutil lmhostid

in the Windows, Linux, or OS X environment. lmutil resides in the same directory as the PGI compilers.

See if any of the hostid values displayed agree with the one found in your license.dat file. There can be more than one displayed.
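The comparison can also be done mechanically. The Python sketch below splits a SERVER line into its fields and checks the hostid against a list of values that you copy by hand from the lmutil lmhostid output. The function names are hypothetical, and the sketch deliberately does not try to parse lmutil's own output format.

```python
def parse_server_line(line):
    """Split 'SERVER <hostname> <hostid> [port]' into its fields."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "SERVER":
        raise ValueError("not a SERVER line: %r" % line)
    port = int(fields[3]) if len(fields) > 3 else None
    return fields[1], fields[2].lower(), port


def hostid_matches(server_line, reported_hostids):
    """True if the SERVER hostid is among the hostids reported for this machine."""
    _, hostid, _ = parse_server_line(server_line)
    return hostid in {h.lower() for h in reported_hostids}


line = "SERVER hostname_of_the_license_server 001122334455 27000"
print(parse_server_line(line))
print(hostid_matches(line, ["AABBCCDDEEFF", "001122334455"]))  # True
print(hostid_matches(line, ["aabbccddeeff"]))                  # False
```

The comparison is case-insensitive because hostids are hex strings and may be displayed in either case.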

If the hostid has changed, then log in to your account and create new license keys. If you don't have an account, you'll need to register first. The PIN Code used to tie the license PIN to your account can be found in the original PGI order confirmation you received at the time of purchase. If you do not have your order acknowledgment, then please use the PGI support request form to request that your PIN(s) be tied to your account.


If the hostid has not changed and the license manager fails, what next?

Use the following checklist to eliminate the easy things. But first, run the following commands after the compilers have been installed and the environment set up. lmutil resides in the PGI bin directory.

lmutil lmhostid           ! obtain the hostids that are detectable by the Flex utilities 
lmutil lmhostid -hostname ! obtain the hostname 
lmutil lmhostid -internet ! obtain the IPaddress of the platform. (Best if same as 'ping hostname' from previous.) 

The following applies to FlexNet style node-locked and floating licenses only. Most problems with running license daemons are relatively minor although they may not seem like it at the time.

  • Is the compiler looking at the right license file?

    You should see different behavior if you remove or rename the license.dat file. Check to see that $PGROUPD_LICENSE_FILE or $LM_LICENSE_FILE is set properly. One of these environment variables should be set to the full pathname of the license file, or for Network floating (PGI Professional Edition) licenses can also be of the form port_number_in_license_key_file@hostname_of_the_license_server. (Note: on the license server itself, $PGROUPD_LICENSE_FILE or $LM_LICENSE_FILE should only be set to the full pathname of the license file.)

    Windows expects the license key file to be at C:\Program Files\PGI\license.dat, and macOS expects it to be at /opt/PGI/license.dat. Linux looks first at $PGROUPD_LICENSE_FILE, then at $LM_LICENSE_FILE, and then at $PGI/license.dat.

  • Is the hostname of the license server in the SERVER line of the license file a name that every machine using the compilers (including the license server) can use to reach the license server? In other words, does ping server_name on the client give the same IP address as ping server_name on the server? Floating licenses only work on a single network.

    If the hostname needs to change, you can edit the license file by hand without regenerating the license key. If a license key has worked before, try using the same hostname and PORT number (usually 27000) in the SERVER line as the previous working license key. If you can run

    ping hostname_of_the_license_server

    successfully from the license server to itself (for FlexNet style node-locked licenses), or from the client platform to the license server (for FlexNet style floating licenses), that hostname may be a candidate. But it may not be.

    The PGI FlexNet utility lmutil can provide a possible hostname_of_the_license_server. Try

    lmutil lmhostid -hostname

    which should return HOSTNAME=xxxxxx and the possible hostname_of_the_license_server would be xxxxxx in this case. But it may not work, due to DNS and other reasons.

    Many node-locked licenses can use the standard localhost name for the hostname_of_the_license_server and it often bypasses DNS.

    Users can edit by hand the hostname_of_the_license_server on the SERVER line of the license file without having to regenerate the license.

  • Does the hostid agree with the one in license.dat?

    See the previous section.

  • Is $PGI defined properly (e.g. export PGI=/opt/pgi) ?

    Entering the command

    set | grep PGI 

    will determine if it is properly defined.

  • Is $PGROUPD_LICENSE_FILE defined properly (e.g. export PGROUPD_LICENSE_FILE=/opt/pgi/license.dat)?

    Entering the command

    env | grep PGROUPD_LICENSE_FILE

    will determine if it is properly defined.

  • Has the license.dat file been modified improperly?

    This file must adhere to a specific format. Specifically,

    • no extra lines
    • no linefeeds or returns between lines ending with '\' and the next line(s).
  • Has the line "DAEMON pgroupd" in license.dat been modified to point to the pgroupd location? As an example

    DAEMON pgroupd /usr/pgi/linux86/16.1/bin/pgroupd 

    Note that PGI Linux Release 2016 (16.4) and later licenses require the pgroupd version 11.13.1.3 included with the release packages.

  • Could there be a problem with lmgrd or pgroupd?

    To be sure, stop them all and restart the license manager. Enter the following commands:

    ps ax | grep lmgrd

    or

    $PGI/linux86-64/16.1/bin/lmutil lmstat
    
    ps ax | grep pgroupd

    If processes lmgrd and pgroupd are running, kill them off. Then enter the command

    $PGI/linux86-64/16.1/bin/lmgrd.rc restart
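Two items on this checklist lend themselves to a quick script: which license setting a Linux system will pick up, and whether license.dat has a stray blank line after a continuation backslash. The Python sketch below is only an illustration. The function names are hypothetical, and treating a blank line after a trailing backslash as the formatting error is one interpretation of the rule stated above.

```python
def license_location(env):
    """Return (source, value) for the first license setting found, in the Linux lookup order."""
    for name in ("PGROUPD_LICENSE_FILE", "LM_LICENSE_FILE"):
        value = env.get(name)
        if value:
            return name, value
    pgi = env.get("PGI")
    if pgi:
        return "PGI", pgi.rstrip("/") + "/license.dat"
    return None, None


def blank_after_continuation(text):
    """Return 1-based numbers of blank lines that directly follow a line ending in a backslash."""
    lines = text.splitlines()
    return [
        i + 2
        for i, line in enumerate(lines[:-1])
        if line.rstrip().endswith("\\") and not lines[i + 1].strip()
    ]


print(license_location({"PGI": "/opt/pgi"}))
print(license_location({"LM_LICENSE_FILE": "27000@hal", "PGI": "/opt/pgi"}))
print(blank_after_continuation("PACKAGE a \\\n\nINCREMENT b\n"))  # [2]
```

To check a real environment, pass os.environ to license_location and the contents of your license file to blank_after_continuation.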


Okay, the simple things check out, what next?

Check the Flexera Software (Flexera publishes FlexNet Publisher) support center.


Where can I find my PIN?

When communicating with PGI, please provide your PIN (Product Identification Number) with your inquiry. Your PIN can be found in your license file. Typically the license file is found at $PGI/license.dat (Linux and Mac OS X) or C:\Program Files\PGI\license.dat (Windows). In the license.dat file, look for

VENDOR_STRING=xxxxxx 

where xxxxxx is a six digit number starting with a 1, 5, 7, or 9. This is your PIN. If your file is not in $PGI, look at the location $PGROUPD_LICENSE_FILE.
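If you have several license files, a short Python sketch like the following can pull the PIN out of each one. The pattern encodes only what is stated above (six digits, starting with 1, 5, 7, or 9, directly after VENDOR_STRING=); the function name and sample text are hypothetical.

```python
import re

# Six digits starting with 1, 5, 7, or 9, immediately after VENDOR_STRING=.
PIN_PATTERN = re.compile(r"VENDOR_STRING=([1579]\d{5})(?!\d)")


def find_pins(license_text):
    """Return the distinct PINs found in the text of a license file."""
    return sorted(set(PIN_PATTERN.findall(license_text)))


sample = "   VENDOR_STRING=123456:16:ws:accel HOSTID=001122334455"
print(find_pins(sample))  # ['123456']
```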


How do I create/delete/move a license?

We recommend that you install the software first on the system you are using, or intend to move to, before trying to obtain a license for that system. If you are moving your compilers to a new system, DO NOT remove the compilers from your previous machine until things are working on your new system. Avoid leaving yourself with no working compilers in the event you have a temporary license problem during the transfer.

Determine your PIN as described in the PIN FAQ above.

If you cannot tie your PIN to your account, use the PGI support request form to contact PGI License Support, and provide as much info as you know about the original purchase so we can search. Usually, the name or email address of the purchaser is enough. More information may be needed if you have multiple licenses at your site.

To create a license key, you will need a hostid for the device that is managing the license. If you have a node-locked or a Developer license, this is the hostid of the machine you installed the compilers on. If you have a network floating license, it is the hostid of the license server.

To move a license from one machine to another, log in to your PGI account and click on the "Create Permanent Keys" link. Each PIN tied to your account is linked to the license generator. First, delete your current license and then create a new license using your new hostid. You may have to wait for PGI to approve your request if you move or delete your license more often than twice a year.

We appreciate the effort you have to go through to create or change a license, and we apologize for these extra steps needed to preserve the integrity of our software products.


What do I do when lmutil lmhostid returns a blank?

PGI licenses use FlexNet licensing components to manage compiler licenses. Licenses are generated based upon the unique output on each host of the FlexNet command:

lmutil lmhostid

On Linux systems, this command returns the MAC address of the network cards configured. Older releases (pre 9.0) only returned hostids for net cards configured as eth0. This does not mean you have to be connected to the Internet, but it needs to be "visible" at runtime. For example, if you have root privilege and type /sbin/ifconfig, you will see something like this:

/sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:BE:FE:BE:EF:FD  
          inet addr:167.6.543.21  Bcast:167.1.234.456  Mask:255.255.254.0
          inet6 addr: beef::222:beef:feed:1234/64 Scope:Link
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1234560 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0654321 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:7777 
          RX bytes:1212121212 (4321.0 Mb)  TX bytes:54321234 (567.8 Mb)
          Interrupt:123 

In this case, the output of lmutil lmhostid is '00befebeeffd'.
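The relationship between the HWaddr shown by ifconfig and the hostid is a simple reformatting, sketched here in Python. The function name is hypothetical; the conversion (drop the separators, lowercase the hex digits) is taken from the example above.

```python
def mac_to_hostid(hwaddr):
    """Convert a MAC address like '00:BE:FE:BE:EF:FD' to 12 lowercase hex digits."""
    digits = hwaddr.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("not a 12 hex digit MAC address: %r" % hwaddr)
    return digits


print(mac_to_hostid("00:BE:FE:BE:EF:FD"))  # 00befebeeffd
```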

Most Flex utilities newer than 11.11.1.1 can read most of the hostids on configured Linux, Windows, and Mac systems. Even if your network cards exist and have 12 hex digit MAC addresses, you cannot use them as hostids for a given compiler release unless lmutil lmhostid from that same release can find them. Likewise, compilers that cannot read the hostid when running on the license server cannot use a license generated for it.


How do I use a Windows machine as a floating license server?

Even if you have a floating license for PGI's Linux only compilers, you can still use a 32-bit or 64-bit Microsoft Windows based machine as a license server. Node-locked licenses must use the machine running the compilers as the license server.

To use the machine as a license server, download and install the PGI Windows compilers in the default directory, C:\Program Files\PGI.

From the PGI command window, determine the hostid of the license server by typing lmutil lmhostid, and choosing one of the hostids to use when generating your license keys. The hostname can be found by typing uname -n in the PGI command window.

Copy your license keys into C:\Program Files\PGI\license.dat

Go to Start | Control Panel | Administrative Tools | Services, and select PGI License Server and start or restart the license server, or run lmutil lmreread in the PGI command window.

On every machine that runs the compilers set the environment variable $PGROUPD_LICENSE_FILE to port@hostname. For example, with hostname "hal", export PGROUPD_LICENSE_FILE=27000@hal should work.


Why am I seeing the error '$PGI/linux86-64/*.*/bin/lmutil: No such file or directory'?

PGI compilers use FlexNet for license management, which requires your system provide the Linux Standard Base (LSB) ld linker/loader helper library /lib64/ld-lsb-x86-64.so.3 or a symlink to an equivalent such as ld-linux-x86-64.so.2. For most distributions one of the following commands typically fixes the problem:

     sudo ln -s /lib64/ld-linux-x86-64.so.2 /lib64/ld-lsb-x86-64.so.3
       - or -
     sudo ln -s /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /lib64/ld-lsb-x86-64.so.3

If the problem persists, try installing the full LSB package for your distribution.

Then run lmutil again to see if it works. If not, try reinstalling the PGI compilers.


Can I check-out one of our network floating licenses to use on the road?

Beginning with PGI Release 7.2, users with a network floating license for PGI products can "borrow" a license to use when not connected to the license server.

The following example illustrates using this feature under Linux. Operation under OS X and Windows is similar.

To borrow a pgfortran license, using today's date, do the following:

  1. In your shell, enter

    % lmborrow pgroupd 21-may-2015
    lmborrow - Copyright (c) 1989-2015 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.
    Setting LM_BORROW=21-may-2015:pgroupd:21-may-2015
  2. Compile a program using pgfortran:

    % pgfortran hello.f

    A license to use pgfortran is now borrowed. You will need to repeat this process for each compiler and tool that you wish to borrow. Verify that you're borrowing licenses successfully.

    % lmborrow -status
    lmborrow - Copyright (c) 1989-2015 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.
    Vendor     Feature                             Expiration
    ______     ________                            __________
    pgroupd    pgfortran-lin64                   21-May-15 23:59

    You can now disconnect from the network and use the compiler and tools that you checked out.

    If you wish to return the license before the expiration date, connect to your network and use the lmborrow -return command (see -help for the complete syntax). For example,

    % lmborrow -return pgfortran-lin64

    Again, verify that the license was returned using the lmborrow -status command:

    % lmborrow -status
    lmborrow - Copyright (c) 1989-2007 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.

For a complete list of commands enter

% lmborrow -help
lmborrow - Copyright (c) 1989-2007 Acresso Software Inc. and/or
Acresso Corporation. All Rights Reserved.

Usage:  lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]     (To borrow)
        lmborrow -status           (Report features borrowed to this node)
        lmborrow -clear            (Changed your mind -- do not borrow)
        lmborrow -return [-c licfile] [-d display_name] [-fqdn] feature
                                   (Return feature early)
        lmborrow -help             (Display usage information)

The syntax of the check out is:

lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]

Where all indicates all license vendors and vendorname indicates a specific vendor (e.g., pgroupd for PGI compilers/tools). dd-mmm-yyyy is the date that you anticipate returning the license and [hh:mm] is an optional return time using 24 hour notation (e.g., 11pm is 23:00). By default the return time is 23:59.
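If you script the borrow step, the date has to be in the dd-mmm-yyyy form shown above. The Python sketch below builds the argument list for the check-out syntax. The helper names are hypothetical, and the month abbreviations are spelled out so the result does not depend on the system locale.

```python
from datetime import date

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def lmborrow_date(d):
    """Format a date as dd-mmm-yyyy, e.g. 21-may-2015."""
    return "%02d-%s-%04d" % (d.day, MONTHS[d.month - 1], d.year)


def lmborrow_args(vendor, d, hhmm=None):
    """Build the argument list for: lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]."""
    args = ["lmborrow", vendor, lmborrow_date(d)]
    if hhmm is not None:
        args.append(hhmm)
    return args


print(" ".join(lmborrow_args("pgroupd", date(2015, 5, 21))))           # lmborrow pgroupd 21-may-2015
print(" ".join(lmborrow_args("pgroupd", date(2015, 5, 21), "23:00")))  # lmborrow pgroupd 21-may-2015 23:00
```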


How do I stop the PGI License Service?

To stop the license service on Linux and OS X, locate the lmgrd process and kill it. If the license service is used by other non-PGI products, they will lose access as well.

% ps ax | grep lmgrd

 2786 ?        S      0:00 /opt/pgi/linux86-64/16.1/bin/lmgrd -c /opt/pgi/license.dat
18278 pts/0    S+     0:00 grep lmgrd

% kill -9 2786

To stop the license service on a Windows system use Control Panel->Admin Tools->Services->PGI and then right-click on PGI to STOP the service.


My problem isn't listed here. What do I do now?

If after reading this FAQ you're still having license problems, you may wish to search the PGI Licensing and Installation User Forum. Chances are someone else has run into your problem already. If you don't find an answer to your question in the User Forums, or if you have questions about your license status, how to generate new license keys, or how to get your original order confirmation, use the PGI support request form to contact PGI License Support.


I'm new to PGI. Which compiler flags should I be using?

For users interested in compiling their code to run reasonably fast, we provide this list of recommended default flags. For users interested in achieving peak performance, a list of tuning flags follows below.

PGI Recommended Default Flags

Compiler    Flags
PGFORTRAN   -fast -Mipa=fast,inline
PGCC        -fast -Mipa=fast,inline -Msmartalloc
PGC++       -fast -Mipa=fast,inline -Msmartalloc

Where:

-fast               A generally optimal set of options including global optimization, SIMD vectorization, loop unrolling and cache optimizations.
-Mipa=fast,inline   Aggressive inter-procedural analysis and optimization, including automatic inlining.
-Msmartalloc        Use optimized memory allocation (Linux only).

PGI Tuning Flags

Flag            Usage
-Mconcur        Enable auto-parallelization; for use with multi-core or multi-processor targets.
-mp             Enable OpenMP; enable user inserted parallel programming directives and pragmas.
-Mprefetch      Control generation of prefetch instructions to improve memory performance in compute-intensive loops.
-Msafeptr       Ignore potential data dependencies between C/C++ pointers.
-Mfprelaxed     Relax floating point precision; trade accuracy for speed.
-tp=cpua,cpub   Create a PGI Unified Binary for two or more cpu types, which functions correctly on and is optimized for each of them. For example, -tp=sandybridge,bulldozer optimizes for both Intel 'Sandybridge' and AMD 'Bulldozer' cpu types.
-Mpfi/-Mpfo     Profile Feedback Optimization; requires two compilation passes and an interim execution to generate a profile.

Please see the PGI Compiler Reference Manual for detailed flag information. Find more specific information for tuning many popular community applications on the Porting & Tuning Guides page.


What are Intrinsics and how do I call them directly?

Inline intrinsic functions map to actual x86 machine instructions. Intrinsics are inserted inline to avoid the overhead of a function call. The compiler has special knowledge of intrinsics, so using them may produce better code than extended inline assembly. Intrinsics are available in C and C++ programs running on Linux or Windows only.

See the PGI Compiler User's Guide, Chapter 15, for more information about intrinsics.


Which versions of the OpenMP specification do you support?

PGI supports the OpenMP specification through version 3.1.


How do I remove the 'FORTRAN STOP' message when 'STOP' is encountered?

Declare the NO_STOP_MESSAGE environment variable and assign it any value.


What do I do when the compiler intermittently fails with SIG interrupt or bad output?

If any of the compilers fail from time to time with a SIGSEGV or SIGNAL 11 interrupt, it could be your temp directory.

A failure of this kind looks like the following, where the compile has been terminated:

% pgfortran x.f90 -fast 
pgf90-Fatal-/usr/pgi/linux86-64/12.0/bin/pgf901 TERMINATED by signal 11
Arguments to /usr/pgi/linux86-64/12.0/bin/pgf901
/usr/pgi/linux86-64/12.0/bin/pgf901 x.f90 -opt 2 -terse 1 -inform warn -nohpf -nostatic 
-x 19 0x400000 -quad -x 59 4 -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c 
-x 58 0x10000 -x 124 0x1000 -x 57 0xfb0000 -x 58 0x78031040 -x 70 0x6c00 -x 47 0x400000 
-x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /usr/pgi/linux86-64/12.0/include:/usr/local/
include:/usr/lib/gcc/x86_64-redhat-linux/4.1.2/include:/usr/lib/gcc/x86_64-redhat-linux/
4.1.2/include:/usr/include -def unix i -def __unix -def __unix__ -def linux -def __linux 
-def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L 
-def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= 
-def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ 
-def __SSSE3__  -preprocess -freeform -vect 48 -y 54 1 -x 53 2 -quad -x 119 0x10000000 
-modexport /tmp/pgf90Em9bAdWSVOUt.cmod -modindex /tmp/pgf90om9bQEWzq5Vy.cmdx 
-output /tmp/pgf90Um9bkYHrQ0B4.ilm
              ^^^^^^^^^^^^^^^ 

There are a number of possible reasons why the compiler seg faults during compilation. Two reasons that often come up are 1) temporary directories ($TMPDIR, by default set to /tmp) that are not large enough to handle the intermediate files the compilers create, and 2) not enough stack space (limit stacksize nnnn) for the compilers' needs.
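Both conditions can be inspected before a build. The Python sketch below reports the free space in the temporary directory and the current soft stack limit. It is a generic Unix check, not a PGI utility. The function name is hypothetical, tempfile.gettempdir() is used because it honors $TMPDIR, and the resource module is only available on Unix-like systems.

```python
import resource
import shutil
import tempfile


def compile_environment_report():
    """Report free space in the temp directory and the soft stack limit."""
    tmpdir = tempfile.gettempdir()  # honors $TMPDIR, otherwise falls back to /tmp and similar
    free_bytes = shutil.disk_usage(tmpdir).free
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "tmpdir": tmpdir,
        "tmp_free_mb": free_bytes // (1024 * 1024),
        "stack_soft_limit": "unlimited" if soft == resource.RLIM_INFINITY else soft,
    }


if __name__ == "__main__":
    for key, value in compile_environment_report().items():
        print(key, value)
```

How much space or stack is enough depends on the source being compiled, so the sketch only reports values and does not judge them.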


Why don't the PGI Fortran compilers handle #include statements the same as include statements?

The statements

         #include "filename"
         include "filename" 

are handled differently in pgf77 and pgfortran. #include is a preprocessor statement, while the indented include statement is handled by the front end of the compiler.

To handle files with #include statements, either rename the file from x.f to x.F, or use the switch -Mpreprocess in your compile line.


How do I easily call pgcc as cc?

Some users want to keep gcc, and call pgcc as cc. The easiest way to do this is to create a script file named cc and make sure your path is set up to find the cc script file. The script file, using csh syntax, is:

      #!/bin/csh
      setenv PGI /usr/pgi      # or wherever pgcc is installed
      set path=($PGI/linux86-64/bin $path)
      pgcc $argv:q             # :q keeps arguments that contain spaces intact

If the PGI environment variable is already set, then delete the setenv command.

We don't recommend renaming pgcc to cc. There are several changes necessary for this to work correctly, and each new release can cause problems due to changes in the driver structure.


Why does the F2003 SIZE intrinsic not handle large arrays properly in 64-bit?

The SIZE intrinsic returns a result of the default INTEGER kind, which is four bytes with the 64-bit compilers. When you compile with -i8, the default integer size becomes eight bytes, and SIZE returns an eight-byte result as well.
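To see why a four-byte result is a problem for large arrays, the following Python sketch shows what an element count looks like when it is forced into a signed integer of a given width. This only illustrates two's-complement wraparound under that assumption. It does not claim to reproduce exactly what the compiler returns, and the function name is hypothetical.

```python
def wrap_to_signed(n, nbytes):
    """Interpret n as a two's-complement signed integer of nbytes bytes."""
    bits = 8 * nbytes
    half = 1 << (bits - 1)
    return (n + half) % (1 << bits) - half


elements = 3_000_000_000  # more than a 4-byte INTEGER can hold (max 2147483647)

print(wrap_to_signed(elements, 4))  # -1294967296
print(wrap_to_signed(elements, 8))  # 3000000000
```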


What can I do about precision problems?

The x86 floating point processor performs all of its computations in extended (80-bit) precision. This may cause problems when porting code to the x86 which successfully executes on other (non-x86) systems. The increased precision of the x86 may result in 'different' answers. Also, the increased precision of the x86 may result in infinite loops if equality tests of floating point data are used to control while loops. Examples of problem cases:


  1.    a = <expression>   ! 'copy propagate' a's right-hand side to its use
       b = a + c          ! 'propagate' b
       if (b .eq. y ) ... ! 'exact equality' check

  2.    while ( C.EQ.ONE )
           LT = LT + 1
           A = A*LBETA
           C = DLAMC3( A, ONE )
           C = DLAMC3( C, -A )
       END WHILE

To reduce the precision, the compiler options -pc 64 (round floating point operations to double precision) or -pc 32 (round floating point operations to single precision) may be used.

The -Kieee switch may be used to disable propagating floating point values and to round the argument values passed to intrinsics (sin, cos, etc.).
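Independently of the compiler switches, the fragility of exact equality tests is easy to demonstrate. The Python sketch below (Python floats are IEEE double precision) shows an equality test that fails after rounding, and a tolerance-based comparison that succeeds. Comparing with a tolerance is a general programming practice offered here as a complement, not something the PGI switches do for you, and the tolerance value is an arbitrary assumption.

```python
import math

b = 0.1 + 0.2   # computed value
y = 0.3         # value the code expects

print(b == y)                             # False: the two differ in the last bit
print(math.isclose(b, y, rel_tol=1e-9))   # True: equal within a relative tolerance
```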


How come we get different answers on one platform versus a Linux x86 platform?

The x86 architecture implements a floating-point stack by using eight 80-bit registers. Each register uses bits 0-63 as the significand, bits 64-78 for the exponent, and bit 79 as the sign bit. This extended 80-bit real format used by floating instructions is the default. When values are loaded into the floating point stack they are automatically converted into the extended real format. The precision of the floating point stack can be controlled, however, by setting the precision control bits (bits 8 and 9) of the floating control word appropriately. In this way, the programmer can explicitly set the precision to standard IEEE double or single precision (the Intel documentation, however, claims that this only affects the operations of add, subtract, multiply, divide, and square root.)

We have also noticed that, although extended precision is supposedly the default which is set for the control word, it is set at double precision in the x86 Linux systems. Thus, we now also have a -pc <val> option which can be used on the command line. The values of <val> are:

       32 => single precision
       64 => double precision
       80 => extended precision
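The link between the precision control bits and the -pc values can be sketched in Python. The bit positions (8 and 9) come from the text above. The encoding of the field (00 single, 01 reserved, 10 double, 11 extended) and the sample control words are taken from Intel's x87 documentation rather than from this FAQ, and the function name is hypothetical.

```python
# Precision control (PC) field of the x87 control word, bits 8 and 9.
# 0b01 is reserved and therefore has no entry.
PRECISION_CONTROL = {0b00: 32, 0b10: 64, 0b11: 80}


def pc_setting(control_word):
    """Return the -pc style value (32, 64, or 80) encoded in a control word, or None if reserved."""
    bits = (control_word >> 8) & 0b11
    return PRECISION_CONTROL.get(bits)


print(pc_setting(0x037F))  # 80 (extended precision)
print(pc_setting(0x027F))  # 64 (double precision)
print(pc_setting(0x007F))  # 32 (single precision)
```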

At first glance, an extra 16 bits of precision appears to only be a positive asset. However, operations that are performed exclusively on the floating point stack, without storing into (or loading from) memory, can cause problems with accumulated values within those 16 bits. This can lead to answers, when rounded, that do not match expected results.

We briefly look at several examples which have been encountered. First, we have recently implemented the evaluation of most transcendental functions inline, such as sin, cos, tan, and log, since there are x86 instructions for their direct computation. However, as an example, if the argument to sin is the result of previous calculations performed on the floating point stack, then an 80-bit value vs. a 64-bit value can result in slight discrepancies in the answer. With our sin example, we have seen results even change sign due to the sin curve being so close to an x-intercept value when evaluated. Consistency in this case can be maintained by calling a function which, due to the ABI, must push its arguments on the stack (in this way memory is guaranteed to be accessed, even if the argument is an actual constant.) Thus, even if the called function simply performs the inline expansion, using the function call as a wrapper to sin has the effect of trimming the argument precision down to the expected size. Using the -Mnobuiltin option on the command line for C accomplishes this task by resolving all math routines in the library libm, thus performing a function call of necessity. The other method of generating a function call for math routines, but one which may still produce the inline instructions, is by using the -Kieee switch, described below.

A second example which illustrates the precision control problem can be seen by examining this code fragment adapted from the benchmark "paranoia", used to validate IEEE compliance. This section of code is used to determine machine precision:

        program find_precision  
        w = 1.0
100     w=w+w
        y=w+1
        z=y-w
        if (z .gt. 0) goto 100
C       ... now w is just big enough that |((w+1)-w)-1| >= 1 ...
        print*,w
        end

In this case, where the variables are implicitly real*4, operations are performed on the floating point stack where optimization removed unneeded loads and stores from memory. The general case of copy propagation being performed follows this pattern:

         a = x
         y = 2.0 + a

Instead of storing x into a, then loading a to perform the addition, the value of x can be left on the floating point stack and have 2.0 added to it. Thus, memory accesses in some cases can be avoided, leaving answers in the extended real format. If copy propagation is disabled, stores of all left-hand sides will automatically be performed, and reloaded when needed. This will have the effect of rounding any results to their declared sizes.

For the above program, w has a value of 1.8446744E+19 when executed as is (extended precision). However, if ‑Kieee is set, the value becomes 1.6777216E+07 (single precision). This difference is due to the fact that ‑Kieee disables copy propagation, so all intermediate results are stored into memory, then reloaded when needed. (Actually, when the ‑Kieee switch is set, copy propagation is only disabled for floating point operations, not integer operations.) Of course, with this particular example, setting the ‑pc switch will also adjust the result.

The switch ‑Kieee also has the effect of making function calls to all transcendental routines. Although the routine still produces the machine instruction for computation (unless in C the ‑Mnobuiltin switch is set), arguments are passed on the stack, which results in a memory store and load.

The final effect of ‑Kieee that we discuss is to disable reciprocal division for constant divisors. Normally, for a/b with unknown a and constant b, the expression is converted at compile time to a * (1/b), thus turning an expensive divide into a relatively cheap multiplication. However, small discrepancies can again occur, resulting in differences from expected answers.

Thus, understanding and correctly using the ‑pc, ‑Mnobuiltin, and ‑Kieee switches should enable the user to produce the desired and expected precision for calculations which utilize floating point operations.

Note: Current x86 CPUs have SSE1, SSE2, SSE3, etc. instruction sets which perform 32-bit and 64-bit floating point operations in a vectorized manner. This greatly reduces any precision discrepancies between x86 and other CPU types.


Why when I execute do I get the error message 'libpgc.so: cannot open shared object file'?

The current Linux releases feature a shared libpgc.so along with libpgthread.so. Before, the libraries libpgc.a and libpgthread.a were used, and they are still available.

The default link will use a shared libpgc and libpgthread, so that users can build one executable for several versions of Linux. If you build your application with, for example, pgcc on a RHEL 5.2 system, it can also run on a SuSE 11-3 system. Similarly, an executable created using gcc should run on any Linux system with libc installed.

We have done the same thing with libpgc.so and libpgthread.so. If you wish to execute code built on your (for example) Ubuntu 10.04 system, on another RHEL 6.1 system, simply copy libpgc.so and libpgthread.so from $PGI/linux86-64/2012/REDIST to the target system, and add the directory you placed them in to the LD_LIBRARY_PATH environment variable.

As an example, if we build 'hello world' on a RHEL 6.0 system

% more hello.c
#include <stdio.h>
int main(){printf("hello\n"); return 0;}
% pgcc -o hi hello.c
% hi
hello

To run this program on platform B, which is Ubuntu 11.04

% rcp hi  B:/your_B_dir/hi   ! copy executable to B
% rcp $PGI/linux86-64/2012/REDIST/*.so B:/tmp/.
% rsh B 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/tmp'
(note if "rsh 'echo $LD_LIBRARY_PATH'" indicates it is not defined, use
% rsh B 'export LD_LIBRARY_PATH=/tmp' )    ! update dynamic lib path on B
% rsh B '/your_B_dir/hi' ! run executable on B
hello

If libpgc.so does not exist in $PGI/linux86-64/2012/libso, the linkage will be performed on the $LD_LIBRARY_PATH contents, if the shared library is present.


Do you have any relative performance numbers?

While performance is a very important reason for using the PGI compilers, typically we do not publish any relative performance numbers. Performance depends upon too many factors to make a credible claim that we are N% faster than a competitor. The only true measure is how your application performs on your system. Please download the PGI Community Edition and try it out.

Two organizations that do publish performance results are Standard Performance Evaluation Corporation (SPEC) and Polyhedron. Polyhedron also allows you to download the benchmark source code so you can do your own performance comparison. Again, when looking at these results remember that the only true benchmark is your application.


Do you have an example of using -byteswapio?

Here is an example of using the -byteswapio switch.

Example

% more rtest.f
      program test
      real*4 ssmi
      OPEN(UNIT=10,FILE='ice.89',FORM='UNFORMATTED')
      read(10) ssmi
      print *,'OK: ',ssmi
      end

% more wtest.f
      program test
      real*4 ssmi
      ssmi = -999
      OPEN(UNIT=10,FILE='ice.89',FORM='UNFORMATTED')
      write(10) ssmi
      print *,'OK: ',ssmi
      end

On your Sun workstation (or other big-endian device)

f77 -o w_sparc wtest.f
f77 -o r_sparc rtest.f

On your PGI workstation.

pgfortran -o w86 wtest.f
pgfortran -o w86_swap -byteswapio wtest.f
pgfortran -o r86 rtest.f
pgfortran -o r86_swap -byteswapio rtest.f

If you write the file    |  Then read the file
ice.89 with              |  ice.89 with
-------------------------+-------------------------
w_sparc or w86_swap      |  r_sparc or r86_swap
w86                      |  r86


Does the License Manager allow me to execute on other platforms?

Executables created by PGI compilers are not licensed. There are no license requirements for executables created on any platform. If you compiled with a trial license, the executable will stop functioning after the trial period expires. To prevent this, you will need to recompile codes after permanent license keys are installed.


Why does read(9,rec=recnr,end=100,err=101,iostat=ios) act differently on other compilers?

For statements like

    read(9,rec=recnr,end=100,err=101,iostat=ios) buf

some machines will jump to 100 and return IOSTAT of -1 upon getting to the end of the file, while other machines go to 101 or err exits.

The f77 & f90 standards distinguish between 'error conditions' and 'end-of-file'. The 'correct' (standard conforming, portable) way of writing the read statement to capture both 'errors' and 'end-of-file' is like the following:

    read(card(iarg:),*,err=701,end=701) iskip, irec1

It is true that the SGI treats an end-of-file condition as an error condition if the ERR= specifier is present and the END= specifier is not present. However, this behavior is inconsistent across systems (for example, HP & g77 both abort execution and report an end-of-file).

Another test case shows another inconsistency in various implementations. Consider this test:

        open(unit=10,file='foo',form='unformatted')
        read(10, err=99, iostat=ios) yy
        print *, 'fail1', ios
        stop
99      continue
        print *, 'fail2', ios
        end

According to the standards, if the 'err' branch is taken, the iostat variable will be defined with a positive value. Given that the SGI takes the ERR= branch in the original example, this test should take the ERR= branch as well. But on the SGI, this test executes as:

 fail1          -1

[NOTE that -1 => end-of-file]

The DEC alpha is another system where the ERR= branch is taken for the original example. But, the test above executes as:

 fail2          -1

But in this case, iostat shouldn't be negative since the ERR= branch was taken.

The point of all this is that there are inconsistencies in the way ERR= is handled given an 'end-of-file' condition. Adding the END= specifier to your example guarantees consistent behavior across 'all' systems.


(32-bit Linux compilers) Why can't my compiled code handle even half of the 2GB of memory in my system?

Users now are capable of buying machines with > 4GB of memory in them, so they expect to be able to declare very large arrays. Most understand that the accessible limit ought to really be 2GB for a 32-bit addressable system, when you assume that signed ints may be involved in libraries that work with addresses.

Here are some things we have learned, from users who were more familiar with Linux.

  1. The Linux kernel places shared libraries at 0x40000000 by default, so on x86 you have only about 1GB total for your program code and other elements you provide. It has nothing to do with gcc.

    Possible solutions: (a) link statically, such that there aren't any shared libraries. Or (b) use malloc() to allocate the arrays. That should give you about 3GB total (but note that malloc() can't allocate a single chunk larger than 2GB).

         -Wl,-Bstatic

    will force a static link.

  2. If you wish to modify the kernel, in the kernel source, in file mmap.c, there is a line that reads:

         addr = TASK_UNMAPPED_BASE;

    This is what sets the default address of the shared libs in the memory mapping, and it's at 0x40000000 (1G) by default. So change it to, for example:

         addr = 0x80000000;

    And you should, in theory, have up to ~2GB to use for the codes.

  3. For more info on this, check out the comp.os.linux.development.system newsgroup.

Bottom line is

  1. it is an OS problem, not a compiler problem.
  2. Sometimes, you may have to do a lot of work on Linux to use all of your memory.

When executing, why do I get a stack overflow on Windows?

The error exhibits itself as either stack overflow or sometimes the program just hangs.

To enlarge the stack space, edit the driver file C:\Program Files\PGI\win32\[RELEASE#]\bin\win32rc, and change the line

  LDARGS=""

to something like

  LDARGS="-stack 10000000,50000"

which will enlarge the stack area (maximum size, commit size) of the executable. Relink your application and execute.


When I compile with -Ktrap=inexact, the program gets many exceptions

The PGI compilers do not support exception free execution for ‑Ktrap=inexact. The purpose of the hardware support is for those who have specific uses for its execution, along with the appropriate signal handlers for handling exceptions it produces. It is not meant for normal floating point operation code support.


Why do I get 'illegal instruction' exceptions when I run the program on another platform?

This usually indicates that your CPU does not support an instruction the compiler generated, probably because the code was generated for a newer CPU than the one currently executing it. You can force the compiler to generate code for an older CPU, or you can generate a PGI Unified Binary executable that combines code for a newer CPU type with code for an older CPU type used everywhere else.

The instruction

  pgfortran -V

will output the type of CPU that the PGI compilers determine your machine has.


Why do I get different answers with -r8 set when I declared all my variables as REAL*8?

The Fortran standard does not define the default size of constants. Our compiler treats constants as REAL*4 unless you compile with ‑r8. So a program like

       real*8  x,y,z
       x=50453.61
       y=29581.28
       z=x*y
       write(*,10)x,y,z, 50453.61*29581.28
 10    format(4f20.5)
       end

will produce different answers with ‑r8 set. Code that will treat constants as REAL*8 everywhere should be written as

       real*8  x,y,z
       x=50453.61D0
       y=29581.28D0
       z=x*y
       write(*,10)x,y,z, 50453.61D0*29581.28D0
 10    format(4f20.5)
       end


How do I detect infs and NaNs in pgfortran?

REAL*4 and REAL*8 numbers have a specific format for their exponent and mantissa. Values whose magnitude is too large for the format are printed as inf. Results that are not valid numbers in the IEEE formats are printed as NaN. Overflows or division by zero can result in inf values, while illegal operations such as sqrt(-1.) and log(-10.0) cause NaN.

The x86-64 architecture has status bits set by floating point operations. pgfortran traps some of those status bits, ensuring that they are not set as a side effect of compiler generated code. Specifically, these are the status bits ovf for overflow, divz for divide by zero, and inv for invalid. Other status bits, including inexact, align, unf, and denorm, may be set as a side effect of normal compiler operation.

Some examples follow. To create infs and NaNs at execution time, you may need to code in such a way as to prevent the compiler from catching your improper operation.

program infnan
  real(8) v,w,x,y,z
  v = -1.0d0
  w = 123456789.0d0
  x = v * 10.d0 * log10(v)
  y= exp(w)
  z=1.0d0 /(v + 1.0d0)
  print *,'x (inv ?) =',x
  print *,'y (ovf? ) =',y
  print *,'x (divz?) =',z
end program infnan

% pgfortran -o infnan infnan.f90
% ./infnan
 x (inv ?) =                       NaN
 y (ovf? ) =                       Inf
 x (divz?) =                       Inf

The F2003 standard states the status bits set will be printed after a STOP statement.

program infnan2
  real(8) v,w,x,y,z
  v = -1.0d0
  w = 123456789.0d0
  x = v * 10.d0 * log10(v)
  y= exp(w)
  z=1.0d0 /(v + 1.0d0)
  print *,'x (inv ?) =',x
  print *,'y (ovf? ) =',y
  print *,'x (divz?) =',z
  stop
end program infnan2

% pgfortran -o infnan2 infnan2.f90
% ./infnan2
 x (inv ?) =                       NaN
 y (ovf? ) =                       Inf
 x (divz?) =                       Inf
Warning: ieee_invalid is signaling
Warning: ieee_divide_by_zero is signaling
Warning: ieee_inexact is signaling
FORTRAN STOP

If you wish the program to terminate when an operation causes a status bit change, compile with the following switches: ‑Ktrap=ovf,inv,divz or its shorthand ‑Ktrap=fp. You can also set the environment variable PGI_TERM to determine what happens when the status bits are set. You can even invoke the PGI debugger upon the event.

% pgfortran -o infnan infnan.f90 -Ktrap=divz -g
% ./infnan
Floating exception

% setenv PGI_TERM 'signal'
% ./infnan
Error: floating point exception, integer divide by zero

% setenv PGI_TERM 'signal,debug'
% ./infnan
Error: floating point exception, integer divide by zero
PGDBG 14.1-0 x86-64 (Cluster, 256 Process)
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION.  All rights reserved.
Loading symbols from /my/dir/infnan ...
Loaded: /my/dir/infnan
Stopped at 0x7F229025219A, function __waitpid
0x7F229025219A:  48 3D 0 F0 FF FF       cmpq   $0xFFFFFFFFFFFFF000,%rax

pgdbg> file /my/dir/infnan.f90
"infnan.f90"
pgdbg> list
 #1:     program infnan
 #2:       real(8) v,w,x,y,z
 #3:       v = -1.0d0
 #4:       w = 123456789.0d0
 #5:       x = v * 10.d0 * log10(v)
 #6:       y= exp(w)
 #7:       z=1.0d0 /(v + 1.0d0)
 #8:       print *,'x (inv ?) =',x
 #9:       print *,'y (ovf? ) =',y
 #10:      print *,'x (divz?) =',z

pgdbg>

If you want to determine if any element of a REAL array is a NaN, use the IEEE_ARITHMETIC routine ieee_is_nan(x) which takes real arguments, and because it is elemental, we can feed the entire array to it.

program infnan3
  use,intrinsic::IEEE_ARITHMETIC
  real(8) x(1000),y
  integer i
  do i=1,1000
     x(i)=i
  end do

  y = -1.0d0
  x(500) = y * 10.d0 * log10(y)

  if(any(ieee_is_nan(x))) then 
     print *, "we found a NaN!"
  else 
    print *, "we found NO NaNs!"
  end if
end program infnan3


% pgf90 -o infnan3 infnan3.f90
% ./infnan3
 we found a NaN!

With this technique, we can find whether an array has one or more NaNs, but not the index of the failing element. To do that, we need to use a loop.

program infnan4
  use,intrinsic::IEEE_ARITHMETIC
  real(8) x(1000),y
  integer i
  do i=1,1000
     x(i)=i
  end do

  y = -1.0d0
  x(500) = y * 10.d0 * log10(y)
  x(600) = y /(y + 1.0d0)
  do i=1,1000
   if(ieee_is_nan(x(i))) then
      print *,"X(",i,") is a NaN!"
   else if(.not.(ieee_is_finite(x(i)))) then
      print *,"X(",i,") is an inf!"
   endif
   end do
end program infnan4

% pgfortran -o infnan4 infnan4.f90
% ./infnan4
 X(          500 ) is a NaN!
 X(          600 ) is an inf!


Why does date_and_time(DATE) return 20 for the century?

Century is just defined to be a group of 100 years. For 2012, date_and_time(DATE) sets the century and year values in DATE to 20 and 12, respectively. So, for example, 2012 = 20*100 + 12

Here is a Fortran program that uses date_and_time().

program testread
  implicit none
  integer::cen,year,mon,day
  character(len=8) :: date
  call date_and_time(date)
  read(date,'(4i2)')cen,year,mon,day
  print *,cen,year,mon,day
end program

What should I consider when building executables to run on other versions of Linux?

To build executables for portability across multiple Linux distributions, procedures can be linked statically at build time into the executable on the build machine, or they can be linked dynamically at runtime on the target machine. If you link dynamically, you may need to carry the procedure libraries over to the target machine if versions don't already exist there in a location where the executable expects to find them.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary. For example, compiling with the options -tp=p7,nehalem will generate p7 type code for older CPUs as well as code optimized for the Nehalem-class CPUs and newer.
  • Both PGI and gcc libraries try to be forward compatible. If your executable builds on the oldest Linux platform and the oldest gcc version you wish to support, it has a good chance of working on newer Linux distributions and gcc versions. If the span between the oldest and the newest Linux/PGI/gcc versions is very large, changes in your code can result in problems. Header file and procedure interfaces could have incompatibilities requiring a narrower range of software versions to function properly.
  • You may either copy files in the PGI REDIST directory to the target system on which you wish to run your executable, or you can link with the -Bstatic_pgi switch to ensure that all PGI-specific routines are linked statically, and thus are part of the executable. -Bstatic_pgi allows system and libc routines to link dynamically so that the versions of those routines residing on the target system will be used at execution.
  • The performance of OpenMP codes can sometimes be improved by using the libnuma library, which is linked in by default if properly installed on your system. If you have OpenMP directives in your code, compile with both -mp and -mp=nonuma and compare performance. If you wish to build OpenMP executables that will run on machines with and without the libnuma library present, build the executable with -mp=nonuma (or -nomp).
  • Use the ldd command along with your executable's name (ldd foo.exe for example) on the build machine to determine where your executable expects to find the dynamically linked libraries on the target machine. You can preempt these locations by specifying a directory or directories using the $LD_LIBRARY_PATH environment variable on the target system.

What should I consider when building executables to run on other versions of Windows?

To build executables for portability across multiple Windows platforms, consider the following.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary.
  • Compile on the oldest Windows version, to take advantage of the forward compatibility.
  • Note that prior to Windows 7, PGI compilers use a different tool chain (assemblers, linkers and runtime libraries collectively known as the Windows SDK) than used with Windows 7 and newer. PGI does not support executables built with the older pre-Windows 7 SDK working on Windows 7 and newer versions.

What should I consider when building executables to run on other versions of OS X?

To build executables for portability across multiple OS X platforms, consider the following.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary.
  • Compile on the oldest OS X and Xcode versions, to take advantage of the forward compatibility.
  • Xcode version 4.3 and later use Clang. PGI executables built with non-Clang based Xcode versions are not supported on systems where PGI uses Clang based Xcode versions.

Will every program I build run everywhere?

No. Much depends on what system you build it on, and how much your program uses system routines that have changed from OS release to OS release. To reduce the number of porting issues, we recommend you replace system routine calls with calls to standard Fortran, C, and C++ procedures which are available in every version of a standard-compliant compiler.


With pgcc/pgc++, I have trouble opening binary files on Windows, but not Linux. Why?

To port C code containing reads/writes of binary files to Windows, follow these steps:

  • for open(), include O_BINARY in the second argument (oflag) passed to open(). For example:

    fd = open( "FILE", O_CREAT|O_BINARY );

  • for fopen(), include 'b' in the second argument (mode) passed to fopen() like this:

    fp = fopen( "FILE", "wb" );

I'm seeing "lexical error--unknown token type" when I read text files created on Windows from programs compiled on Linux. Why?

Text files created on Windows are different from text files on Linux. On Windows, lines are terminated by "\r\n", while on Linux, lines are terminated with "\n", and "\r" is treated like an ordinary character. There are two ways to handle this:

For reading Windows text files on Linux from pgfortran compiled files, set the environment variable FORTRANOPT to crlf, and the characters will be read correctly.

export FORTRANOPT="crlf"

Change the file. The utility dos2unix can change the format from the Windows format to the Linux format:

dos2unix windows_format_file.txt

Are there licensing issues with running executables on other platforms?

Executable files built with any PGI license do NOT require a separate license for distribution (a/k/a a runtime license). Any files or libraries located in the PGI REDIST directory can be distributed for use with PGI compiled applications, within the provisions of the PGI End-user License Agreement.

Executable files created using a FlexNet-based PGI license are permanent and perpetual (do not expire), and may be moved to other systems. Executable files created using a PGI Developer license are also permanent and perpetual but are restricted to running only on the same system as they were compiled. Executable files created using PGI temporary licenses (starter, trial, demo) are time limited. The time limit can be removed by recompiling with a permanent license.


How do you call C and C++ routines from pgfortran (ISO_C_BINDING)?

The answer is ISO_C_BINDING, a standard for Fortran-to-C interlanguage communication. Properly used, ISO_C_BINDING should work across many current Fortran compilers. However, it is limited to passing arguments to C routines, not C++. Be sure that the 'C' entry points are present in any C++ routines that are called (extern "C").

The data types in Fortran and C are different. Different Fortran compilers may treat what looks like the same data type (integer, real, pointer) differently than C and C++.

ISO_C_BINDING allows you to declare interfaces that target C routines and data types. This means you don't need to modify the C code you want to call from Fortran. In short, ISO_C_BINDING lets you treat C code as if it were just another Fortran routine.

For example, to call this C subroutine

more csub_a.c: 

#include <stdio.h>
void foo(int *int_arg1, float *real_arg2)
{
  printf("The arguments passed are int_arg1=%d\n and real_arg2=%f \n",*int_arg1, *real_arg2);
 
}

from this Fortran main

more fmain_a.f90:
             
PROGRAM FORTRAN2C
INT_ARG1=1234
REAL_ARG2=5678.9
PRINT *,"passing integer ",INT_ARG1," and real ",REAL_ARG2
CALL FOO(INT_ARG1,REAL_ARG2)
END

you would write a Fortran interface record like this

added interface lines:

INTERFACE
    SUBROUTINE FOO(INT_ARG1, REAL_ARG2) BIND(C, NAME="foo")
    USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT,C_FLOAT
    IMPLICIT NONE
    INTEGER(C_INT) :: INT_ARG1
    REAL(C_FLOAT) :: REAL_ARG2
  END SUBROUTINE FOO
END INTERFACE


This results in
              

PROGRAM FORTRAN2C
INTERFACE
    SUBROUTINE FOO(INT_ARG1, REAL_ARG2) BIND(C, NAME="foo")
    USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT,C_FLOAT
    IMPLICIT NONE
    INTEGER(C_INT) :: INT_ARG1
    REAL(C_FLOAT) :: REAL_ARG2
  END SUBROUTINE FOO
END INTERFACE
INT_ARG1=1234
REAL_ARG2=5678.9
PRINT *,"passing integer ",INT_ARG1," and real ",REAL_ARG2
CALL FOO(INT_ARG1,REAL_ARG2)
END

We can then link successfully with any C object.

pgcc -c csub_a.c -o csub_a_pgi.o
gcc -c csub_a.c -o csub_a_gcc.o
icc -c csub_a.c -o csub_a_intel.o

pgfortran -o f2c_a_pgi  fmain_a.f90  csub_a_pgi.o
pgfortran -o f2c_a_gcc fmain_a.f90 csub_a_gcc.o
pgfortran -o f2c_a_intel fmain_a.f90 csub_a_intel.o

./f2c_a_pgi
             
 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9 
              
./f2c_a_gcc
              
 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9 
              
./f2c_a_intel
              
 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9

Here's another example with more data types passed.

more fmain_b.f90 

       program fort2c
       use fmain2csub_b
       logical*1              bool1
       character              letter1
       integer*4              numint1, numint2
       real                   numfloat1
       double precision       numdoub1
       integer*2              numshor1
call cfunc (bool1, letter1, numint1, numint2, & 
        numfloat1, numdoub1, numshor1)
write( *, 100) & 
        bool1, letter1, numint1, numint2, numfloat1, & 
        numdoub1, numshor1
100    format(1x,"bool1     =  ", L2,/, &          
        " letter1   =  ", A2,/,                 &
        " numint1   = ", I5,/,                  &
        " numint2   = ", I5,/,                  &
        " numfloat1 = ", F6.1,/,                &
        " numdoub1  = ", F6.1,/,                &
        " numshor1  = ", I5,/)
       end

              
more csub_b.c
              
#include <stdio.h>
#define TRUE 0xff
#define FALSE 0
void
cfunc(char *bool1, char *letter1, int *numint1, int *numint2,
      float *numfloat1, double *numdoub1, short *numshor1)
{
   *bool1 = TRUE;
   *letter1 = 'v';
   *numint1 = 11;
   *numint2 = -44;
   *numfloat1 = 39.6 ;
   *numdoub1 = 39.2 ;
   *numshor1 = 981;

}

This time we'll create an interface record and put it in a separate module called fmain2csub_b.

more fmain2csub_b_mod.f90
              
module fmain2csub_b
INTERFACE
    subroutine cfunc ( bool1, letter1, numint1, &
    numint2, numfloat1, numdoub1, numshor1) BIND(C,NAME="cfunc")
    use, intrinsic  :: iso_c_binding, only:C_CHAR,C_BOOL, &
        C_INT,C_FLOAT,C_DOUBLE,C_SHORT
    logical(C_BOOL) ::      bool1
    character(C_CHAR) ::    letter1
    integer(C_INT) ::       numint1, numint2
    real(C_DOUBLE) ::       numdoub1
    real(C_FLOAT) ::        numfloat1
    integer(C_SHORT) ::     numshor1
    end subroutine cfunc
END INTERFACE
end module fmain2csub_b

When we build with an interface module (like fmain2csub_b) we need to link in the interface module object file as well.

pgcc csub_b.c  -c  -o csub_b_pgi.o
gcc csub_b.c -c -o csub_b_gcc.o
icc csub_b.c -c -o csub_b_intel.o


pgfortran -c fmain2csub_b_mod.f90
pgfortran -o f2c_b_pgi fmain_b.f90 csub_b_pgi.o fmain2csub_b_mod.o
pgfortran -o f2c_b_gcc fmain_b.f90 csub_b_gcc.o fmain2csub_b_mod.o
pgfortran -o f2c_b_intel fmain_b.f90 csub_b_intel.o fmain2csub_b_mod.o
    
f2c_b_pgi
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
    
              
f2c_b_gcc
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
  
              
f2c_b_intel
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981

Here is the same main program and interface module but this time with an equivalent C++ routine. Note the use of -pgcpplibs in the link step.

more cppsub_b.cpp
              
#define TRUE 0xff
#define FALSE 0
extern "C" {
extern void
cfunc(char *bool1, char *letter1, int *numint1,int *numint2, float *numfloat1,
        double *numdoub1, short *numshor1)
{
   *bool1 = TRUE;
   *letter1 = 'v';
   *numint1 = 11;
   *numint2 = -44;
   *numfloat1 = 39.6 ;
   *numdoub1 = 39.2 ;
   *numshor1 = 981;

}
}

              
g++ -c cppsub_b.cpp -o cppsub_b_g++.o
icc -c cppsub_b.cpp -o cppsub_b_intel.o
pgcpp -c cppsub_b.cpp -o cppsub_b_pgi.o
pgfortran -c fmain2csub_b_mod.f90
pgfortran -o f2cpp_b_pgi fmain_b.f90 cppsub_b_pgi.o fmain2csub_b_mod.o -pgcpplibs
pgfortran -o f2cpp_b_g++ fmain_b.f90 cppsub_b_g++.o fmain2csub_b_mod.o -pgcpplibs
pgfortran -o f2cpp_b_intel fmain_b.f90 cppsub_b_intel.o fmain2csub_b_mod.o -pgcpplibs
f2cpp_b_pgi
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981

              
f2cpp_b_g++
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981

              
f2cpp_b_intel
              
 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
 

How do you call Fortran routines from a C/C++ routine?

Interfacing to Fortran subroutines and functions is very similar to what we did with ISO_C_BINDING, but there is no equivalent standard defined. When calling Fortran routines from C/C++, both the called Fortran program and the calling C/C++ programs need to be altered. The steps involve:

  • Create a C/C++ prototype for the Fortran program that defines the CALLED Fortran program's entry point and its arguments in the equivalent C/C++ data types. The entry point is usually the function/subroutine name in all lower case, with an appended underscore.
  • Use ISO_C_BINDING to declare the Fortran arguments as compatible C arguments. Instead of an interface record, you are defining the entry arguments to use the ISO_C_BINDING data types.
  • Initialize pgfortran internal tables. This is specific to pgfortran. Call pghpf_init() from the user's main program and any subsequent pgfortran interaction should be consistent.

Suppose you wish to call the following function in Fortran:

more forts_c.f90
                
subroutine forts ( bool1, letter1, numint1, numint2, numfloat1, numdoub1, numshor1 )
logical*1        :: bool1 
character*1      :: letter1 
integer          :: numint1 
integer          :: numint2 
real             :: numfloat1 
double precision :: numdoub1
integer*2        :: numshor1 
    bool1 = .true.
    letter1="v"
    numint1=123
    numint2=-456
    numdoub1=5432.1
    numfloat1=6789.0
    numshor1=53
    return
end

And you wish to call the Fortran routine from a C++ main program like the following:

more cmain_c.C
                
#include <iostream>
int main(int argc, char **argv)
{
   char          bool1;
   char          letter1;
   int           numint1, numint2;
   float         numfloat1;
   double        numdoub1;
   short         numshor1;
   int           i;
   for (i=0; i < argc; i++){
       std::cout << "main: command line arg " << i << " is " << argv[i] << std::endl;
   }
   forts(&bool1,&letter1,&numint1,&numint2,&numfloat1,
          &numdoub1,&numshor1);
   std::cout << "main: bool1=     " << (bool1?"TRUE":"FALSE") << std::endl;
   std::cout << "main: letter1=   " << letter1  << std::endl;
   std::cout << "main: numint1=   " << numint1  << std::endl;
   std::cout << "main: numint2=   " << numint2  << std::endl;
   std::cout << "main: numfloat1= " << numfloat1  << std::endl;
   std::cout << "main: numdoub1=  " << numdoub1  << std::endl;
   std::cout << "main: numshor1=  " << numshor1  << std::endl;
}

To make this work, we need to add/modify the following lines in each source file.

more cmain_c.C
                
 
#include <iostream>
                
extern "C" { 
extern void forts_( char *, char *, int *,int *,float *, double *,short * );
}
                
                
#if defined (_PGI_)
extern "C" void pghpf_init(int *);
static int zz = 0;
#endif
                
int main(int argc, char **argv)
{
   char          bool1;
   char          letter1;
   int           numint1, numint2;
   float         numfloat1;
   double        numdoub1;
   short         numshor1;
   int           i;
                
#if defined (_PGI_)
    pghpf_init(&zz);
#endif
                
   for (i=0; i < argc; i++){
       std::cout << "main: command line arg " << i << " is " << argv[i] << std::endl;
   }
                
   forts_(&bool1,&letter1,&numint1,&numint2,&numfloat1,
          &numdoub1,&numshor1);
                
   std::cout << "main: bool1=     " << (bool1?"TRUE":"FALSE") << std::endl;
   std::cout << "main: letter1=   " << letter1  << std::endl;
   std::cout << "main: numint1=   " << numint1  << std::endl;
   std::cout << "main: numint2=   " << numint2  << std::endl;
   std::cout << "main: numfloat1= " << numfloat1  << std::endl;
   std::cout << "main: numdoub1=  " << numdoub1  << std::endl;
   std::cout << "main: numshor1=  " << numshor1  << std::endl;
}

more forts_c.f90
                
subroutine forts ( bool1, letter1, numint1, numint2, numfloat1, numdoub1, numshor1 )
                
use, intrinsic  :: iso_c_binding, only:C_CHAR,C_BOOL, &
        C_INT,C_FLOAT,C_DOUBLE,C_SHORT
    logical(C_BOOL) ::      bool1
    character(kind=C_CHAR) :: letter1
    integer(C_INT) ::       numint1, numint2
    real(C_DOUBLE) ::       numdoub1
    real(C_FLOAT) ::        numfloat1
    integer(C_SHORT) ::     numshor1
                
bool1 = .true.
letter1="v"
numint1=123
numint2=-456
numdoub1=5432.1
numfloat1=6789.0
numshor1=53
return
  end


                
pgfortran -c forts_c.f90
pgcpp -o c2f_pgi cmain_c.C forts_c.o -pgf90libs
 
./c2f_pgi
                
c2f_pgi is a program
main: command line arg 0 is c2f_pgi
main: command line arg 1 is is
main: command line arg 2 is a
main: command line arg 3 is program
main: bool1=     TRUE
main: letter1=   v
main: numint1=   123
main: numint2=   -456
main: numfloat1= 6789
main: numdoub1=  5432.1
main: numshor1=  53

or

                
ifort -c forts_c.f90
icc -o c2f_intel cmain_c.C forts_c.o 

./c2f_intel
                
c2f_intel is another program
main: command line arg 0 is c2f_intel
main: command line arg 1 is is
main: command line arg 2 is another
main: command line arg 3 is program
main: bool1=     TRUE
main: letter1=   v
main: numint1=   123
main: numint2=   -456
main: numfloat1= 6789
main: numdoub1=  5432.1
main: numshor1=  53
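
The calling convention used above (an extern "C" declaration, a trailing underscore on the routine name, and every argument passed by address) can be sketched in isolation. This is a minimal sketch, not part of the original example: FortsResult and call_forts are hypothetical names, and the body of forts_ is a C++ stand-in for the Fortran subroutine so that the fragment links without forts_c.o.

```cpp
// Stand-in for the Fortran subroutine, written in C++ only so that this
// sketch links without forts_c.o. In a real build, drop this definition
// and keep just the extern "C" declaration.
extern "C" void forts_(char *bool1, char *letter1, int *numint1, int *numint2,
                       float *numfloat1, double *numdoub1, short *numshor1)
{
    *bool1     = 1;        // .true.
    *letter1   = 'v';
    *numint1   = 123;
    *numint2   = -456;
    *numfloat1 = 6789.0f;
    *numdoub1  = 5432.1;
    *numshor1  = 53;
}

// Hypothetical holder for the values the routine writes back.
struct FortsResult {
    bool   bool1;
    char   letter1;
    int    numint1, numint2;
    float  numfloat1;
    double numdoub1;
    short  numshor1;
};

// Hypothetical wrapper: hides the trailing underscore and the
// pass-by-address convention from the rest of the C++ code.
FortsResult call_forts()
{
    char bool1 = 0;
    char letter1 = ' ';
    FortsResult r = {};
    forts_(&bool1, &letter1, &r.numint1, &r.numint2,
           &r.numfloat1, &r.numdoub1, &r.numshor1);
    r.bool1   = (bool1 != 0);   // one-byte logical mapped to a C++ bool
    r.letter1 = letter1;
    return r;
}
```

Keeping the underscore-suffixed name inside one wrapper means only that wrapper has to change if a different Fortran compiler decorates the symbol differently.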
 

Can you link programs compiled with pgcpp and programs compiled with g++?

Object files built with pgcpp cannot be linked with object files from other C++ compilers, because pgcpp uses a different name-mangling algorithm. The 13.* release compilers change this by adding a new compiler, pgc++, which uses the same name-mangling algorithm as g++.

For example, here is a C++ main routine.

more  cppmain_d.cc
                
#include <iostream>
extern double do_triad(double *a, double *b, double *c, double *d, int len, int rep);
using namespace std;
    
int main(int argc, char** argv) {
 
const int length=10000;
  
double *a = new double[length];
double *b = new double[length];
double *c = new double[length];
double *d = new double[length];
    
for(int i=0; i< length; ++i)
a[i]=b[i]=c[i]=d[i]=1.0;
 
do_triad(a,b,c,d,length,2);
    
delete [] a;
delete [] b;
delete [] c;
delete [] d;

return 0;
}

And here is a procedure it calls:

more  cppsub_d.cc
                
#include <iostream>
using namespace std;


double do_triad(double *a, double *b, double *c, double *d,
int len, int rep) {
int i,j;
{
for(j=0;j<rep;j++){
for(i=0;i<len;++i)
a[i]=b[i]+c[i]*d[i];
cerr << j << " of " << rep << endl;
}
}
return 0.0;
}

Linking the g++ object file with pgcpp fails, but linking it with pgc++ succeeds.

g++ -c cppsub_d.cc
pgcpp -o pgcpp2g++  cppmain_d.cc cppsub_d.o
                
cppmain_d.cc:
cppmain_d.o: In function `main':
/home/tull/xfer/13.0/examples/C2C/./cppmain_d.cc:17: undefined reference to `do_triad__FPdN31iT5'
cppsub_d.o: In function `do_triad(double*, double*, double*, double*, int, int)':
cppsub_d.cc:(.text+0x9b): undefined reference to `std::cerr'
cppsub_d.cc:(.text+0xa0): undefined reference to `std::ostream::operator<<(int)'
cppsub_d.cc:(.text+0xad): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
cppsub_d.cc:(.text+0xba): undefined reference to `std::ostream::operator<<(int)'
cppsub_d.cc:(.text+0xbf): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)'
cppsub_d.cc:(.text+0xc7): undefined reference to `std::ostream::operator<<(std::ostream& (*)(std::ostream&))'
cppsub_d.o: In function `__static_initialization_and_destruction_0(int, int)':
cppsub_d.cc:(.text+0x113): undefined reference to `std::ios_base::Init::Init()'
cppsub_d.cc:(.text+0x118): undefined reference to `std::ios_base::Init::~Init()'


                
pgc++ -o pgc++2g++ cppmain_d.cc cppsub_d.o 
cppmain_d.cc:

                
./pgc++2g++ 
0 of 2
1 of 2
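
The mangling difference behind the failed pgcpp link can be inspected from C++ itself. The sketch below assumes a GCC- or Clang-style toolchain that provides <cxxabi.h>; demangle is a hypothetical helper name, and _Z8do_triadPdS_S_S_ii is the name the g++ (Itanium ABI) scheme would be expected to produce for do_triad, not something taken from the transcript above.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle a symbol written in the Itanium C++ ABI
// scheme (the one g++ uses). Returns the input unchanged when the symbol
// cannot be demangled under that scheme.
std::string demangle(const std::string &symbol)
{
    int status = 0;
    char *readable =
        abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return symbol;
    }
    std::string result(readable);
    std::free(readable);   // __cxa_demangle allocates with malloc
    return result;
}
```

With g++, demangle("_Z8do_triadPdS_S_S_ii") should yield do_triad(double*, double*, double*, double*, int, int). The symbol pgcpp asked for in the linker error, do_triad__FPdN31iT5, does not follow that scheme, which is why the object file built by g++ cannot satisfy it.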