PGI Release 2016 version 16.4 and newer include updated FlexNet license daemons (version 11.13.1.3). This update addresses a FlexNet security vulnerability. We recommend all users update their license daemons. See the FlexNet Update FAQ for more information. This update also requires you to update your PGI license keys to a new format. Older keys are incompatible.

What type of licenses are available?

There are seven types of licenses (actually license keys) provided with PGI compilers.

  • Starter license, provided by PGI.
  • Trial license, created by a user.
  • Demo license, provided by PGI.
  • Developer license, created by a user.
  • Community License, provided by PGI.
  • Node-locked license, created by a user.
  • Network floating license, created by a user.

The first three licenses are temporary and in most cases work with any PGI release from version 7.2 or newer. All PGI temporary licenses work with any PGI product running on any supported operating system.

  • A Starter license has a 30 day time limit. It also introduces a 30 day time limit in the executable. Unless codes compiled with a starter license are recompiled with a permanent or demo license, they will stop working. Typically, starter licenses are only provided to new users one time. Users can generate a trial license after the starter license expires. The Starter license works regardless of whether the FlexNet license manager (lmgrd) is running.
  • The Trial license has a 15 day time limit. It also introduces a 15 day time limit in the executable. Unless codes compiled with a trial license are recompiled with a permanent or demo license, they will stop working. Under the terms of the PGI evaluation license agreement, each account is permitted to generate one 15 day trial license every six months, or whenever a new release comes out. The Trial license acts much like the FlexNet type license in that it requires a hostid.
  • The Demo license will work regardless of whether the FlexNet license manager (lmgrd) is running, and it does not introduce a time limit. Typically, demo licenses are provided by PGI only to permanent PGI license owners when their permanent licenses have problems.
  • A Developer license is valid for one year from the date of issue. Developer licenses restrict compiled executable files to running only on the same system on which they were compiled. Developer licenses are like node-locked licenses, where only the machine running the license service can run the compilers. Unidev licenses use the FlexNet license manager.
  • Community licenses are valid for one year from their date of issue and are not FlexNet managed. Community licenses are bundled with most PGI download packages. There are no restrictions on where a community license can be installed, nor limitations on built executable files.

Node-locked and floating licenses are permanent and perpetual: they never stop working with the PGI release version for which they were created. FlexNet-managed licenses also work with all earlier PGI release versions back to 7.2. Except as noted below, all PGI permanent licenses are specific to individual PGI products running on a single operating system (e.g. PGI Fortran/C/C++ Workstation for Linux).

  • Node-locked licenses restrict use to a particular host and one user at a time. They are very useful when a number of users want to share a PGI product. Node-locked licenses for x86 systems require that the license service (lmgrd) run on the same machine as the compilers. Executables can run on any other compatible machine. Node-locked licenses for OpenPOWER use a proprietary licensing scheme.
  • Floating licenses allow the license service to run on a machine different from the machines running the compilers. Floating licenses allow a mix of system types running different operating systems. The maximum number of concurrent users is determined by counting usage across all of the systems running the PGI compilers. Floating licenses usually require only a single license server. Users with floating licenses can "borrow" seats for out-of-office compiling.

    Starting with the PGI 2016 release, floating license key formats for x86 are different from node-locked license key formats. Node-locked license keys define the product with FEATURE sections, and allow only a single set of PGI license keys per license server to be served. Floating license keys define the product with INCREMENT sections, and allow multiple PGI floating license keys to use the same license server.

Can you please give me an overview of a PGI license?

Here is a synopsis of a typical set of PGI license keys. It is broken down into sections, with important parts labeled. The dashed lines and (!) comments are added, and the parts in parentheses are optional.

------------------------------------------------------------------------
SERVER <license server hostname> 0123456789ab 27000        !SERVER line
------------------------------------------------------------------------
DAEMON pgroupd (/path/to/pgroupd) (PORT=port_number)       !DAEMON line
------------------------------------------------------------------------
PACKAGE (or INCREMENT) PGI2015-<PGI PIN> pgroupd <support end date>  A13AB920D570 \
<…;>
6167 7015 3F05 9C37 2315 ACDF 1B73 DAA9 FBAE"

The SERVER line has three components: the hostname of the license server, the hostid of the license server, and the PORT used by lmgrd to process the license requests. You can edit the hostname and PORT used (27000) by hand without regenerating the license.

The DAEMON line has three components: the name of the DAEMON used (pgroupd), the path to the daemon if it is not in the same location as lmgrd (as in /usr/pgi/daemon/pgroupd), and a PORT which pgroupd would use to communicate.

The PORT used (27000 for lmgrd, not designated for pgroupd) can be any unused integer that the Operating System allows. We do not know how to tell which PORT numbers would be successful, but you can change them by hand in the license file.

The path to pgroupd can be added to a DAEMON line if you want to make sure the license service uses a particular DAEMON. Newer DAEMONs can read the newer license file formats, as well as the older ones.
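Because the hostname and PORT on the SERVER line can be edited by hand, the edit can also be scripted. The following Python sketch splits a SERVER line into the three components described above and rebuilds it with a new hostname and PORT while leaving the hostid untouched. The function names are hypothetical, and treating the PORT as optional is an assumption rather than something this FAQ states.

```python
from typing import NamedTuple, Optional


class ServerLine(NamedTuple):
    hostname: str
    hostid: str
    port: Optional[int]  # assumption: a SERVER line may omit the port


def parse_server_line(line: str) -> ServerLine:
    """Split a 'SERVER hostname hostid [port]' line into its fields."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "SERVER":
        raise ValueError(f"not a SERVER line: {line!r}")
    port = int(fields[3]) if len(fields) > 3 else None
    return ServerLine(fields[1], fields[2], port)


def set_server_host_and_port(line: str, hostname: str, port: int) -> str:
    """Return the SERVER line with a new hostname and port; the hostid is kept."""
    parsed = parse_server_line(line)
    return f"SERVER {hostname} {parsed.hostid} {port}"


if __name__ == "__main__":
    original = "SERVER hal 0123456789ab 27000"
    print(parse_server_line(original))
    print(set_server_host_and_port(original, "localhost", 27001))
```

Only the hostname and PORT are rewritten here; changing the hostid still requires generating new license keys.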

The first line of the license section begins with PACKAGE for node-locked licenses and with INCREMENT for floating licenses. Each has a date-formatted number like 2016.1231 in it. This number designates when the license support expires, and it also tells you with which releases the license will work. In this case, the license support expires on December 31, 2016, which means the license will work with current releases and with future releases up to the 16.12 release (if it happens). You do not need to update this license until after you renew your support late in 2016.
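As a rough illustration of how the date-formatted number can be read, the Python sketch below converts a stamp such as 2016.1231 into a calendar date and into a release label. The function names are hypothetical, and the YY.M release label format is inferred from the release numbers used in this FAQ (16.1, 16.4, 16.12), so treat it as an assumption.

```python
from datetime import date


def support_end_date(stamp: str) -> date:
    """Convert a 'YYYY.MMDD' stamp such as '2016.1231' into a date."""
    year_part, _, month_day = stamp.partition(".")
    if len(year_part) != 4 or len(month_day) != 4:
        raise ValueError(f"expected YYYY.MMDD, got {stamp!r}")
    return date(int(year_part), int(month_day[:2]), int(month_day[2:]))


def last_covered_release(stamp: str) -> str:
    """Return the newest 'YY.M' release label reached by the support date."""
    end = support_end_date(stamp)
    return f"{end.year % 100:02d}.{end.month}"


if __name__ == "__main__":
    print(support_end_date("2016.1231"))
    print(last_covered_release("2016.1231"))
```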


What are the most common license problems?

The most common license problems are:

  • License support expired and the license does not qualify for this release.
  • The Linux server system lacks lsb, the Linux Standard Base package that the license utilities need.
  • Hostid used in the license does not match the machine the license server is running on.
  • The hostname used is not being recognized properly by the license utilities.
  • Using two or more PGI node-locked licenses on the same license server.
  • Firewalls prevent lmgrd and pgroupd from communicating.
  • Using a node-locked license on a machine other than the license server.

What is the hostid?

On Windows, Linux, and OS X, the hostid is the Ethernet address (a/k/a MAC) of the network card that is configured. Older releases supported only device eth0 for FlexNet style licensing, but releases 9.0 and later support multiple configured network cards. If your license works with the current release, it should work with previous releases back to 7.2. Making the current release license work often requires the version of pgroupd be the one included with the current release.

Whichever license is used, the hostid used to create the PGI license must not change, or the license validation will fail. For example, a laptop may disable its Ethernet network card when it is not connected to a wired network. In this case, the laptop should use a different hostid, or refrain from using the compilers when on wifi or no network.


How do I know if the license hostid has changed, and what should I do then?

The SERVER line of your license file has the form

SERVER hostname_of_the_license_server 001122334455 27000

In the above, 001122334455 is the hostid, and 27000 is the PORT used for lmgrd.

If you have a permanent license, run

lmutil lmhostid

in the Windows, Linux, or OS X environment. lmutil resides in the same directory as the PGI compilers.

See if any of the hostid values displayed agree with the one found in your license.dat file. There can be more than one displayed.

If the hostid has changed, then log in to your account and create new license keys. If you don't have an account, you'll need to register first. The PIN Code used to tie the license PIN to your account can be found in the original PGI order confirmation you received at the time of purchase. If you do not have your order acknowledgment, please contact PGI License Support and request that your PIN(s) be tied to your account.


If the hostid has not changed and the license manager fails, what next?

Use the following checklist to eliminate the easy things. But first, run the following commands after the compilers have been installed and the environment set up. lmutil resides in the PGI bin directory.

lmutil lmhostid           ! obtain the hostids that are detectable by the flex utilities 
lmutil lmhostid -hostname ! obtain the hostname 
lmutil lmhostid -internet ! obtain the IPaddress of the platform. (Best if same as 'ping hostname' from previous.) 

The following applies to FlexNet style node-locked and floating licenses only. Most problems with running license daemons are relatively minor although they may not seem like it at the time.

  • Is the compiler looking at the right license file?

    You should see different behavior if you remove or rename the license.dat file. Check that $PGROUPD_LICENSE_FILE is set properly. $PGROUPD_LICENSE_FILE should be set to the full pathname of the license file, for node-locked (e.g. Workstation) licenses. For floating (SERVER, CDK) licenses, $PGROUPD_LICENSE_FILE can be the full pathname or of the form port_number_in_license@hostname_of_the_license_server. (Note: on the license server itself, $PGROUPD_LICENSE_FILE should only be set to the full pathname of the license file.)

    Windows expects the license to be at C:\Program Files\PGI\license.dat, and OS X expects the license to be at /opt/PGI/license.dat. Linux looks first at $PGROUPD_LICENSE_FILE, then at $LM_LICENSE_FILE, and then at $PGI/license.dat.

  • Is the hostname of the license server in the SERVER line of the license file, a name that every machine using the compilers (including the license server) can use to communicate with the license server? In other words, will ping server_name on the client give the same IPaddress as ping server_name on the Server? Floating Licenses only work on a single network.

    If the hostname needs to change, you can edit the license file by hand without regenerating the license. If a license has worked before, try using the same hostname and PORT number (usually 27000) in the SERVER line as the previous working license. To test a candidate hostname, run

    ping hostname_of_the_license_server
    

    If the ping succeeds from the license server to itself (for FlexNet style node-locked licenses), or from the client platform to the license server (for FlexNet style floating licenses), the hostname may be a candidate. But it may not be.

    The PGI FlexNet utility lmutil can provide a possible hostname_of_the_license_server. Try

    lmutil lmhostid -hostname
    

    which should return HOSTNAME=xxxxxx and the possible hostname_of_the_license_server would be xxxxxx in this case. But it may not work, due to DNS and other reasons.

    Many node-locked licenses can use the standard localhost name for the hostname_of_the_license_server and it often bypasses DNS.

    Users can edit by hand the hostname_of_the_license_server on the SERVER line of the license file without having to regenerate the license.

  • Does the hostid agree with the one in license.dat?

    See the previous section.

  • Is $PGI defined properly (e.g. export PGI=/opt/pgi) ?

    Entering the command

    set | grep PGI 
    

    will determine if it is properly defined.

  • Is $PGROUPD_LICENSE_FILE defined properly (e.g. export PGROUPD_LICENSE_FILE=/opt/pgi/license.dat)?

    Entering the command

    env | grep PGROUPD_LICENSE_FILE
    

    will determine if it is properly defined.

  • Has the license.dat file been modified improperly?

    This file must adhere to a specific format. Specifically,

    • no extra lines
    • no linefeeds or returns between lines ending with '\' and the next line(s).
  • Has the line "DAEMON pgroupd" in license.dat been modified to point to the pgroupd location? As an example

    DAEMON pgroupd /usr/pgi/linux86/16.1/bin/pgroupd 
    

    Note that PGI Linux Release 2014 (14.3) and later licenses require the pgroupd version 11.11.1.1 included with the release packages.

  • Could there be a problem with lmgrd or pgroupd?

    To be sure, stop them all and restart the license manager. Enter the following commands:

    ps ax | grep lmgrd
    

    or

    $PGI/linux86-64/16.1/bin/lmutil lmstat
    
    ps ax | grep pgroupd
    

    If processes lmgrd and pgroupd are running, kill them off. Then enter the command

    $PGI/linux86-64/16.1/bin/lmgrd.rc restart
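
The Linux lookup order mentioned in the first checklist item ($PGROUPD_LICENSE_FILE, then $LM_LICENSE_FILE, then $PGI/license.dat) can be sketched in Python as follows. This is only a diagnostic aid that lists the candidates in that order; it does not reproduce how the compilers or FlexNet actually open the files, and the function name is hypothetical.

```python
import os
import posixpath
from typing import List, Mapping, Optional


def linux_license_candidates(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List license locations in the order this FAQ gives for Linux."""
    env = os.environ if env is None else env
    candidates = []
    # The two environment variables are consulted first, in this order.
    for name in ("PGROUPD_LICENSE_FILE", "LM_LICENSE_FILE"):
        value = env.get(name)
        if value:
            candidates.append(value)
    # The fallback is license.dat under the PGI installation directory.
    pgi_root = env.get("PGI")
    if pgi_root:
        candidates.append(posixpath.join(pgi_root, "license.dat"))
    return candidates


if __name__ == "__main__":
    for candidate in linux_license_candidates():
        print(candidate)
```

A value may be either a full pathname or a port@hostname entry, so the sketch returns the strings as they are instead of checking that a file exists.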
    


Okay, the simple things check out, what next?

Check the Flexera Software (Flexera publishes FlexNet Publisher) support center.


Where can I find my PIN?

When communicating with PGI, please provide your PIN (Product Identification Number) with your inquiry. Your PIN can be found in your license file. Typically the license file is found at $PGI/license.dat (Linux and Mac OS X) or C:\Program Files\PGI\license.dat (Windows). In the license.dat file, look for

VENDOR_STRING=xxxxxx 

where xxxxxx is a six digit number starting with a 1, 5, 7, or 9. This is your PIN. If your file is not in $PGI, look at the location $PGROUPD_LICENSE_FILE.
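If you have several license files to go through, the PIN lookup can be automated. The Python sketch below searches license text for VENDOR_STRING= followed by a six digit number starting with 1, 5, 7, or 9, as described above. The function name and the sample text are made up for illustration, and anything that follows the PIN in a real VENDOR_STRING value is simply ignored.

```python
import re
from typing import List

# Six digits, the first being 1, 5, 7, or 9, not followed by another digit.
PIN_PATTERN = re.compile(r"VENDOR_STRING=([1579]\d{5})(?!\d)")


def find_pins(license_text: str) -> List[str]:
    """Return the distinct PINs found in the text, in order of appearance."""
    pins: List[str] = []
    for pin in PIN_PATTERN.findall(license_text):
        if pin not in pins:
            pins.append(pin)
    return pins


if __name__ == "__main__":
    sample = "VENDOR_STRING=512345:16 \\\n    VENDOR_STRING=512345:16"
    print(find_pins(sample))
```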


How do I create/delete/move a license?

This question applies to those who have a paid license or a University Developer license. See below for information on deleting or moving trial licenses or Free PGI for OS X.

We recommend that you install the software first on the system you are using, or intend to move to, before trying to obtain a license for that system. If you are moving your compilers to a new system, DO NOT remove the compilers from your previous machine until things are working on your new system. Avoid leaving yourself with no working compilers in the event you have a temporary license problem during the transfer.

Determine your PIN as described in the PIN FAQ above.

If you cannot tie your PIN to your account, send mail to PGI License Support, and provide as much info as you know about the original purchase so we can search. Usually, the name or email address of the purchaser is enough. More info may be needed if you have multiple licenses at your site.

To create a license, you will need a hostid for the device that is managing the license. If you have a node-locked or a Developer license, this is the hostid of the machine you installed the compilers on. If you have a network floating license, it is the hostid of the license server.

To move a license from one machine to another, log in to your PGI account and click on the "Create Permanent Keys" link. Each PIN tied to your account is linked to the license generator. First, delete your current license and then create a new license using your new hostid. You may have to wait for PGI to approve your request if you move or delete your license more often than twice a year.

We appreciate the effort you have to go through to create or change a license, and we apologize for these extra steps needed to preserve the integrity of our software products.

If you have a trial license, these steps are not required. Simply delete the license file to remove it, or copy it to another system to move it. Free PGI for OS X does not use license keys. Reinstall the Free PGI software to move it to another machine.


When can I use my license server to serve more than one set of PGI license keys?

Users with multiple PGI products might like the convenience and efficiency of a single computer to serve floating license keys. Either by merging the license keys into a single file, or by adding them to the paths in $PGROUPD_LICENSE_FILE, they assume that the license server can handle two or more PGI products in much the way the server can handle one PGI product and (for example) a MATLAB product.

In a similar vein, users with more than one node-locked, single user product license also might like to run multiple license keys on the same license server, giving more than one user simultaneous access to PGI compilers and tools on that computer.

It is now possible for PGI floating server licenses to be combined, or for the license service to manage multiple floating licenses on the server.

However, multiple node-locked licenses cannot be served by the same license server. Either the server will fail, or it will only enable the first node-locked license read.

A FlexNet Publisher license server consists of a master daemon, named lmgrd, and a number of vendor daemons. The PGI vendor daemon is named pgroupd. The master daemon coordinates the license key requests from users and routes these requests to the appropriate vendor daemon. The vendor daemon actually manages check-out and check-in requests for each managed product.

Once the vendor daemon finds a license key that matches the requested FEATURE software components, no further processing is performed on the license key file. In other words, if there are multiple, equivalent PGI license keys in the license file, only one of those license keys will be served by the vendor daemon. The other license keys will not be recognized and will not be served.

For floating licenses, the INCREMENT software components will combine with other floating licenses, adding the seats and features of other licenses to the mix. You can, for example, use the same license server to host an Accelerator enabled license with a non-Accelerated one, and have the two licenses properly served.

Some customers try to run multiple instances of lmgrd on the same machine, with the intent of having each copy manage a unique set of PGI license keys. This will not work. FlexNet vendor daemons use a fixed location in the shared filesystem as a lock file to ensure that only one vendor daemon process can run on the license server computer.

In short, the rules are one PGI node-locked license key per license server, and multiple PGI floating licenses per license server.
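The difference between the two rules can be pictured with a toy model in Python. It is not how pgroupd is implemented; it only mirrors the behavior described above, where a repeated FEATURE entry is ignored after the first match while INCREMENT entries add their seats together. The function name, tuple layout, and feature names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Entry = Tuple[str, str, int]  # (keyword, feature name, seat count)


def served_seats(entries: Iterable[Entry]) -> Dict[str, int]:
    """Toy model of how many seats per feature end up being served."""
    seats: Dict[str, int] = {}
    for keyword, feature, count in entries:
        if keyword == "FEATURE":
            # Only the first matching FEATURE entry counts; later ones are ignored.
            seats.setdefault(feature, count)
        elif keyword == "INCREMENT":
            # INCREMENT entries combine, adding their seats to the total.
            seats[feature] = seats.get(feature, 0) + count
        else:
            raise ValueError(f"unexpected keyword: {keyword!r}")
    return seats


if __name__ == "__main__":
    print(served_seats([("FEATURE", "pgfortran", 1), ("FEATURE", "pgfortran", 1)]))
    print(served_seats([("INCREMENT", "pgfortran", 2), ("INCREMENT", "pgfortran", 3)]))
```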


What do I do when lmutil lmhostid returns a blank?

PGI licenses use FlexNet licensing components to manage compiler licenses. Licenses are generated based upon the unique output on each host of the FlexNet command:

lmutil lmhostid

On Linux systems, this command returns the MAC address of the network cards configured. Older releases (pre 9.0) only returned hostids for net cards configured as eth0. This does not mean you have to be connected to the Internet, but it needs to be "visible" at runtime. For example, if you have root privilege and type /sbin/ifconfig, you will see something like this:

/sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:BE:FE:BE:EF:FD  
          inet addr:167.6.543.21  Bcast:167.1.234.456  Mask:255.255.254.0
          inet6 addr: beef::222:beef:feed:1234/64 Scope:Link
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1234560 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0654321 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:7777 
          RX bytes:1212121212 (4321.0 Mb)  TX bytes:54321234 (567.8 Mb)
          Interrupt:123 

In this case, the output of lmutil lmhostid is '00befebeeffd'.
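The mapping from the HWaddr field to the hostid is a matter of dropping the separators and lowercasing the hex digits. A small Python sketch, with hypothetical function names, that also compares a license hostid against a list of detected hostids:

```python
import string
from typing import Iterable


def mac_to_hostid(mac: str) -> str:
    """Turn a MAC such as '00:BE:FE:BE:EF:FD' into '00befebeeffd'."""
    hostid = mac.replace(":", "").replace("-", "").lower()
    if not hostid or any(ch not in string.hexdigits for ch in hostid):
        raise ValueError(f"not a hexadecimal MAC address: {mac!r}")
    return hostid


def hostid_matches(license_hostid: str, detected: Iterable[str]) -> bool:
    """True if the license hostid equals any detected hostid, ignoring case."""
    return license_hostid.lower() in {item.lower() for item in detected}


if __name__ == "__main__":
    print(mac_to_hostid("00:BE:FE:BE:EF:FD"))
    print(hostid_matches("00BEFEBEEFFD", ["0123456789ab", "00befebeeffd"]))
```

The list of detected hostids is meant to be filled in by hand from the output of lmutil lmhostid; the sketch does not try to parse that output.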

Most flex utils newer than 11.11.1.1 can read most of the hostids on configured Linux, Windows, and Mac systems. If your network cards exist and have 16 digit MAC addresses, you can use them as hostids for a given compiler release only if lmutil lmhostid from that same release can find them. Compilers that cannot read the hostid will not work when running on the license server.


How do I use a Windows machine as a floating license server?

Even if you have a floating license for PGI's Linux only compilers, you can still use a 32-bit or 64-bit Microsoft Windows based machine as a license server. Node-locked licenses must use the machine running the compilers as the license server.

To use the machine as a license server, download and install the PGI Windows compilers in the default directory, C:\Program Files\PGI.

From the PGI command window, determine the hostid of the license server by typing lmutil lmhostid, and choosing one of the hostids to use when generating your license keys. The hostname can be found by typing uname -n in the PGI command window.

Copy your license keys into C:\Program Files\PGI\license.dat

Go to Start | Control Panel | Administrative Tools | Services, and select PGI License Server and start or restart the license server, or run lmutil lmreread in the PGI command window.

On every machine that runs the compilers set the environment variable $PGROUPD_LICENSE_FILE to port@hostname. For example, with hostname "hal", export PGROUPD_LICENSE_FILE=27000@hal should work.
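The port@hostname value is easy to mistype, so a quick format check can help before exporting it. The Python sketch below builds and parses such a value; the function names are hypothetical, and it checks only the shape of the string, not whether the server is reachable.

```python
from typing import Tuple


def format_license_server(port: int, hostname: str) -> str:
    """Build a 'port@hostname' value for PGROUPD_LICENSE_FILE."""
    return f"{port}@{hostname}"


def parse_license_server(value: str) -> Tuple[int, str]:
    """Split 'port@hostname' into (port, hostname), or raise ValueError."""
    port_text, separator, hostname = value.partition("@")
    if not separator or not port_text.isdigit() or not hostname:
        raise ValueError(f"expected port@hostname, got {value!r}")
    return int(port_text), hostname


if __name__ == "__main__":
    print(format_license_server(27000, "hal"))
    print(parse_license_server("27000@hal"))
```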


Why am I seeing the error '$PGI/linux86-64/16.*/bin/lmutil: No such file or directory'?

PGI Release 2016 compilers use versions of flexlm that require Linux Standard Base aka lsb be present. You can still use the flexlm software that came with Release 2011, but you will eventually need lsb installed on Linux.

The most common symptom of the problem will be a message when you run the compilers

$PGI/linux86-64/16.*/bin/lmutil: No such file or directory 

To determine if the problem is lsb, type

% lsb_release 

If it returns core-3.0 or higher, then you have lsb. If less than 3.0 or 'n/a' is returned, you need to install it, and then run the compilers.

For Ubuntu, the process is

%  apt-get install lsb

On other Linux versions, you need to install it from the RPM.

Recheck lmutil and see if things work. If not, try reinstalling the PGI compilers.


Can I check-out one of our network floating licenses to use on the road?

Beginning with PGI Release 7.2, users with network floating licenses for PGI products can "borrow" a license to use when not connected to the license server.

The following example illustrates using this feature under Linux. Operation under OSX and Windows is similar.

To borrow a pgfortran license, using today's date, do the following:

  1. In your shell, enter

    % lmborrow pgroupd 21-may-2015
    lmborrow - Copyright (c) 1989-2015 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.
    Setting LM_BORROW=21-may-2015:pgroupd:21-may-2015
    
  2. Compile a program using pgfortran:

    % pgfortran hello.f
    

    A license to use pgfortran is now borrowed. You will need to repeat this process for each compiler and tool that you wish to borrow. Verify that you're borrowing licenses successfully.

    % lmborrow -status
    lmborrow - Copyright (c) 1989-2015 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.
    Vendor     Feature                             Expiration
    ______     ________                            __________
    pgroupd    pgfortran-lin64                   21-May-15 23:59
    

    You can now disconnect from the network and use the compiler and tools that you checked out.

    If you wish to return the license before the expiration date, connect to your network and use the lmborrow -return command (see -help for the complete syntax). For example,

    % lmborrow -return pgfortran-lin64
    

    Again, verify that the license was returned using the lmborrow -status command:

    % lmborrow -status
    lmborrow - Copyright (c) 1989-2007 Acresso Software Inc. and/or
    Acresso Corporation. All Rights Reserved.
    

For a complete list of commands enter

% lmborrow -help
lmborrow - Copyright (c) 1989-2007 Acresso Software Inc. and/or
Acresso Corporation. All Rights Reserved.

Usage:  lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]     (To borrow)
        lmborrow -status           (Report features borrowed to this node)
        lmborrow -clear            (Changed your mind -- do not borrow)
        lmborrow -return [-c licfile] [-d display_name] [-fqdn] feature
                                   (Return feature early)
        lmborrow -help             (Display usage information)

The syntax of the check out is:

lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]

Where all indicates all license vendors and vendorname indicates a specific vendor (e.g., pgroupd for PGI compilers/tools). dd-mmm-yyyy is the date that you anticipate returning the license and [hh:mm] is an optional return time using 24 hour notation (e.g., 11pm is 23:00). By default the return time is 23:59.
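When borrowing is scripted, the date argument has to be produced in the dd-mmm-yyyy form shown above. A minimal Python sketch follows; the helper names are hypothetical, and the month abbreviations are spelled out so the result does not depend on the system locale. It only builds the argument list and does not run lmborrow.

```python
from datetime import date
from typing import List, Optional

_MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
           "jul", "aug", "sep", "oct", "nov", "dec")


def lmborrow_date(day: date) -> str:
    """Format a date as dd-mmm-yyyy, for example 21-may-2015."""
    return f"{day.day:02d}-{_MONTHS[day.month - 1]}-{day.year:04d}"


def lmborrow_args(vendor: str, until: date, return_time: Optional[str] = None) -> List[str]:
    """Build the argument list for 'lmborrow {all|vendorname} dd-mmm-yyyy [hh:mm]'."""
    args = ["lmborrow", vendor, lmborrow_date(until)]
    if return_time is not None:
        args.append(return_time)
    return args


if __name__ == "__main__":
    print(" ".join(lmborrow_args("pgroupd", date(2015, 5, 21), "23:00")))
```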


How do I stop the PGI License Service?

To stop the License service on Linux and OS X, locate the lmgrd process and kill it. If the license service is used by other non-PGI products, they will no longer have access as well.

% ps ax | grep lmgrd

 2786 ?        S      0:00 /opt/pgi/linux86-64/16.1/bin/lmgrd -c /opt/pgi/license.dat
18278 pts/0    S+     0:00 grep lmgrd

% kill -9 2786

To stop the license service on a Windows system use Control Panel->Admin Tools->Services->PGI and then right-click on PGI to STOP the service.


My problem isn't listed here. What do I do now?

If after reading this FAQ you're still having license problems, you may wish to search the PGI Licensing and Installation User Forum. Chances are someone else has run into your problem already. If you don't find an answer to your question in the User Forums, or if you have questions about your license status, how to generate new license keys, or how to get your original order confirmation, send email to PGI License Support.


I'm new to PGI. Which compiler flags should I be using?

For users interested in compiling their code to run reasonably fast, we provide this list of recommended default flags. For users interested in achieving peak performance, a list of tuning flags follows below.

PGI Recommended Default Flags

Compiler    Flags
PGFORTRAN   -fast -Mipa=fast,inline
PGCC        -fast -Mipa=fast,inline -Msmartalloc
PGC++       -fast -Mipa=fast,inline -Msmartalloc

Where:

-fast A generally optimal set of options including global optimization, SIMD vectorization, loop unrolling and cache optimizations.
-Mipa=fast,inline Aggressive inter-procedural analysis and optimization, including automatic inlining.
-Msmartalloc Use optimized memory allocation (Linux only).

PGI Tuning Flags

Flag Usage
-Mconcur Enable auto-parallelization; for use with multi-core or multi-processor targets.
-mp Enable OpenMP; enable user inserted parallel programming directives and pragmas.
-Mprefetch Control generation of prefetch instructions to improve memory performance in compute-intensive loops.
-Msafeptr Ignore potential data dependencies between C/C++ pointers.
-Mfprelaxed Relax floating point precision; trade accuracy for speed.
-tp=cpua,cpub Create a PGI Unified Binary for two or more cpu types, which functions correctly on and is optimized for two cpus. For example, -tp=sandybridge,bulldozer optimizes for both Intel 'Sandybridge' and AMD 'Bulldozer' cpu types.
-Mpfi/-Mpfo Profile Feedback Optimization; requires two compilation passes and an interim execution to generate a profile.

Please see the PGI Compiler Reference Manual for detailed flag information. Find more specific information for tuning many popular community applications on the Porting & Tuning Guides page.


What are Intrinsics and how do I call them directly?

Inline intrinsic functions map to actual x86 or x64 machine instructions. Intrinsics are inserted inline to avoid the overhead of a function call. The compiler has special knowledge of intrinsics, so using them may produce better code than extended inline assembly. Intrinsics are available in C and C++ programs running on Linux or Windows only.

See the PGI Compiler User's Guide, Chapter 15, for more information about intrinsics.


Which versions of the OpenMP specification do you support?

PGI supports the OpenMP specification through version 3.1.


How do I remove the 'FORTRAN STOP' message when 'STOP' is encountered?

Declare the NO_STOP_MESSAGE environment variable and assign it any value.


What do I do when the compiler intermittently fails with SIG interrupt or bad output?

If any of the compilers fail from time to time with a SIGSEGV or SIGNAL 11 interrupt, it could be your temp directory.

A typical failure, in which a compile has been terminated, looks like this:

% pgfortran x.f90 -fast 
pgf90-Fatal-/usr/pgi/linux86-64/12.0/bin/pgf901 TERMINATED by signal 11
Arguments to /usr/pgi/linux86-64/12.0/bin/pgf901
/usr/pgi/linux86-64/12.0/bin/pgf901 x.f90 -opt 2 -terse 1 -inform warn -nohpf -nostatic 
-x 19 0x400000 -quad -x 59 4 -x 59 4 -x 15 2 -x 49 0x400004 -x 51 0x20 -x 57 0x4c 
-x 58 0x10000 -x 124 0x1000 -x 57 0xfb0000 -x 58 0x78031040 -x 70 0x6c00 -x 47 0x400000 
-x 48 4608 -x 49 0x100 -x 120 0x200 -stdinc /usr/pgi/linux86-64/12.0/include:/usr/local/
include:/usr/lib/gcc/x86_64-redhat-linux/4.1.2/include:/usr/lib/gcc/x86_64-redhat-linux/
4.1.2/include:/usr/include -def unix i -def __unix -def __unix__ -def linux -def __linux 
-def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L 
-def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= 
-def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ 
-def __SSSE3__  -preprocess -freeform -vect 48 -y 54 1 -x 53 2 -quad -x 119 0x10000000 
-modexport /tmp/pgf90Em9bAdWSVOUt.cmod -modindex /tmp/pgf90om9bQEWzq5Vy.cmdx 
-output /tmp/pgf90Um9bkYHrQ0B4.ilm
              ^^^^^^^^^^^^^^^ 

There are a number of possible reasons why the program seg faults during compilation. Two reasons that often come up are 1) temporary directories ($TMPDIR by default set to /tmp) that are not large enough to handle the intermediate files the compilers create, and 2) not enough stack space (limit stacksize nnnn) for the compilers' needs.
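Both conditions can be inspected before starting a large build. The Python sketch below reports the free space in the temporary directory (Python's tempfile module honors $TMPDIR) and the current soft stack limit. It runs on Linux and OS X only, since it uses the resource module, and it deliberately applies no thresholds because this FAQ does not state how much space the compilers need. The function names are hypothetical.

```python
import resource
import shutil
import tempfile
from typing import Optional


def temp_dir_free_bytes() -> int:
    """Free space, in bytes, on the filesystem holding the temp directory."""
    return shutil.disk_usage(tempfile.gettempdir()).free


def stack_soft_limit_bytes() -> Optional[int]:
    """Soft stack limit in bytes, or None when the limit is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return None if soft == resource.RLIM_INFINITY else soft


if __name__ == "__main__":
    print("temp directory:", tempfile.gettempdir())
    print("free bytes:", temp_dir_free_bytes())
    print("stack soft limit:", stack_soft_limit_bytes())
```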


Why don't the PGI Fortran compilers handle #include statements the same as include statements?

The statements

         #include "filename"
         include "filename" 

are handled differently in pgf77 and pgfortran. #include is a preprocessor statement, while the indented include statement is handled by the front end of the compiler.

To handle files with #include statements, either rename the file from x.f to x.F, or use the switch ‑Mpreprocess in your compile line.


How do I easily call pgcc as cc?

Some users want to keep gcc, and call pgcc as cc. The easiest way to do this is to create a script file named cc and make sure your path is set up to find the cc script file. The script file, using csh syntax, is:

      #!/bin/csh
      setenv PGI /usr/pgi      #! or wherever pgcc is installed
      set path=($PGI/linux86/bin $path)
      pgcc $* 

If the PGI environment variable is already set, then delete the setenv command.

We don't recommend renaming pgcc to cc. There are several changes necessary for this to work correctly, and each new release can cause problems due to changes in the driver structure.


Why does the F2003 SIZE intrinsic not handle large arrays properly in 64-bit?

The SIZE intrinsic returns the same type as the default INTEGER, which is four bytes with the 64-bit compilers. When you compile with -i8, the default integer size is eight bytes, and the SIZE intrinsic returns an eight-byte result.
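The arithmetic behind the limit is simple: a four-byte signed integer cannot hold a count above 2**31 - 1. The Python sketch below shows what happens to a larger element count when it is squeezed into four bytes. It illustrates the integer range only and makes no claim about the exact value the compiler returns for an oversized array; the function names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a 4-byte signed integer can hold


def as_four_byte_integer(count: int) -> int:
    """Value that results when count is stored in a 4-byte signed integer."""
    return ctypes.c_int32(count).value


def needs_eight_byte_integer(count: int) -> bool:
    """True when count does not fit in a 4-byte default INTEGER."""
    return count > INT32_MAX


if __name__ == "__main__":
    elements = 3_000_000_000
    print(needs_eight_byte_integer(elements))
    print(as_four_byte_integer(elements))
```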


What can I do about precision problems?

The x86 floating point processor performs all of its computations in extended (80-bit) precision. This may cause problems when porting code to the x86 which successfully executes on other (non-x86) systems. The increased precision of the x86 may result in 'different' answers. Also, the increased precision of the x86 may result in infinite loops if equality tests of floating point data are used to control while loops. Examples of problem cases:


  1.    a = <expression>   ! 'copy propagate' a's right-hand side to its use
       b = a + c          ! 'propagate' b
       if (b .eq. y ) ... ! 'exact equality' check
    

  2.    while ( C.EQ.ONE )
           LT = LT + 1
           A = A*LBETA
           C = DLAMC3( A, ONE )
           C = DLAMC3( C, -A )
       END WHILE
    

To reduce the precision, the compiler options -pc 64 (round floating point operations to double precision) or -pc 32 (round floating point operations to single precision) may be used.

The ‑Kieee switch may be used to disable propagating floating point values and to round the argument values passed to intrinsics (sin, cos, etc.).


How come we get different answers on one platform versus a Linux x86 platform?

The x86 architecture implements a floating-point stack using eight 80-bit registers. Each register uses bits 0-63 as the significand, bits 64-78 for the exponent, and bit 79 as the sign bit. This extended 80-bit real format used by floating-point instructions is the default. When values are loaded into the floating point stack they are automatically converted into the extended real format. The precision of the floating point stack can be controlled, however, by setting the precision control bits (bits 8 and 9) of the floating-point control word appropriately. In this way, the programmer can explicitly set the precision to standard IEEE double or single precision (the Intel documentation, however, claims that this only affects the operations of add, subtract, multiply, divide, and square root).
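
The precision control field can be decoded with plain bit operations. The following C sketch is illustrative only: the function names are hypothetical, and the encoding (00 = single, 10 = double, 11 = extended, 01 reserved) is taken from the Intel architecture manuals rather than from this FAQ.

```c
#include <stdint.h>

/* Extract the 2-bit precision-control (PC) field, bits 8 and 9,
   from a 16-bit x87 control word value. */
unsigned pc_field(uint16_t control_word)
{
    return (control_word >> 8) & 0x3u;
}

/* Map the PC field to the significand width it selects.
   Assumed encoding (Intel manuals): 00 = single (24 bits),
   10 = double (53 bits), 11 = extended (64 bits); 01 is reserved. */
int pc_significand_bits(uint16_t control_word)
{
    switch (pc_field(control_word)) {
    case 0:  return 24;
    case 2:  return 53;
    case 3:  return 64;
    default: return -1;   /* reserved encoding */
    }
}
```

For instance, a control word of 0x037F decodes to a 64-bit significand (extended precision), while 0x027F decodes to 53 bits (double precision). These two values are examples chosen to exercise the encodings.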

We have also noticed that, although extended precision is supposedly the default which is set for the control word, it is set at double precision in the x86 Linux systems. Thus, we now also have a ‑pc <val> option which can be used on the command line. The values of <val> are:

       32 => single precision
       64 => double precision
       80 => extended precision

At first glance, an extra 16 bits of precision appears to only be a positive asset. However, operations that are performed exclusively on the floating point stack, without storing into (or loading from) memory, can cause problems with accumulated values within those 16 bits. This can lead to answers, when rounded, that do not match expected results.

We briefly look at several examples that have been encountered. First, we have recently implemented inline evaluation of most transcendental functions, such as sin, cos, tan, and log, since there are x86 instructions for their direct computation. However, if the argument to sin is the result of previous calculations performed on the floating point stack, then an 80-bit value versus a 64-bit value can result in slight discrepancies in the answer. With our sin example, we have even seen results change sign because the sin curve is so close to an x-intercept at the point of evaluation. Consistency in this case can be maintained by calling a function which, due to the ABI, must push its arguments on the stack (in this way memory is guaranteed to be accessed, even if the argument is an actual constant). Thus, even if the called function simply performs the inline expansion, using the function call as a wrapper to sin has the effect of trimming the argument precision down to the expected size. Using the -Mnobuiltin option on the command line for C accomplishes this by resolving all math routines in the library libm, which necessarily performs a function call. The other method of generating a function call for math routines, one which may still produce the inline instructions, is the -Kieee switch, described below.

A second example which illustrates the precision control problem can be seen by examining this code fragment adapted from the benchmark "paranoia", used to validate IEEE compliance. This section of code is used to determine machine precision:

        program find_precision  
        w = 1.0
100     w=w+w
        y=w+1
        z=y-w
        if (z .gt. 0) goto 100
C       ... now w is just big enough that |((w+1)-w)-1| >= 1 ...
        print*,w
        end

In this case, where the variables are implicitly real*4, operations are performed on the floating point stack where optimization removed unneeded loads and stores from memory. The general case of copy propagation being performed follows this pattern:

         a = x
         y = 2.0 + a

Instead of storing x into a, then loading a to perform the addition, the value of x can be left on the floating point stack and have 2.0 added to it. Thus, memory accesses in some cases can be avoided, leaving answers in the extended real format. If copy propagation is disabled, stores of all left-hand sides will automatically be performed, and reloaded when needed. This will have the effect of rounding any results to their declared sizes.

For the above program, w has a value of 1.8446744E+19 when executed as is (extended precision). However, if -Kieee is set, the value becomes 1.6777216E+07 (single precision). This difference is due to the fact that -Kieee disables copy propagation, so all intermediate results are stored into memory, then reloaded when needed. (Actually, copy propagation is only disabled for floating point operations, not integer, when the -Kieee switch is set.) Of course, with this particular example, setting the -pc switch will also adjust the result.
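
The single-precision result can be reproduced in C by forcing every intermediate value through memory. This is only a sketch of the idea: the function name is hypothetical, and the volatile qualifier stands in for the stores and reloads that the text attributes to -Kieee.

```c
/* C sketch of the Fortran loop above. Each variable is volatile so that
   every assignment is stored to memory and rounded to single precision. */
float find_precision(void)
{
    volatile float w = 1.0f, y, z;
    do {
        w = w + w;      /* double w each pass */
        y = w + 1.0f;   /* rounded to float when stored */
        z = y - w;      /* becomes 0 once w + 1 rounds back to w */
    } while (z > 0.0f);
    return w;
}
```

With IEEE single-precision stores, the loop stops at w = 16777216 (2**24), which matches the 1.6777216E+07 value quoted above.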

The switch ‑Kieee also has the effect of making function calls to all transcendental routines. Although the routine still produces the machine instruction for computation (unless in C the ‑Mnobuiltin switch is set), arguments are passed on the stack, which results in a memory store and load.

The final effect of the -Kieee switch that we discuss is to disable reciprocal division for constant divisors. That is, for a/b with unknown a and constant b, the expression is normally converted at compile time to a * (1/b), thus turning an expensive divide into a relatively cheap multiplication. However, small discrepancies can again occur, resulting in differences from expected answers.
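
A small C sketch shows the kind of discrepancy that reciprocal division can introduce. The function name is hypothetical, and the volatile qualifiers are there only to keep each step rounded to double precision.

```c
/* Returns 1 if dividing by b and multiplying by the precomputed
   reciprocal 1/b give different double-precision results for a. */
int reciprocal_differs(double a, double b)
{
    volatile double quotient   = a / b;          /* one rounding */
    volatile double reciprocal = 1.0 / b;        /* rounded once here */
    volatile double product    = a * reciprocal; /* rounded again here */
    return quotient != product;
}
```

With IEEE double-precision arithmetic, reciprocal_differs(5.0, 3.0) returns 1, because 5.0/3.0 and 5.0*(1.0/3.0) differ in the last bit, whereas reciprocal_differs(6.0, 2.0) returns 0 because 1/2 is exactly representable.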

Thus, understanding and correctly using the ‑pc, ‑Mnobuiltin, and ‑Kieee switches should enable the user to produce the desired and expected precision for calculations which utilize floating point operations.

Note: Current x86/x86-64 CPUs have SSE1, SSE2, SSE3, and later instruction sets which perform 32-bit and 64-bit floating point operations in a vectorized manner. This greatly reduces any precision discrepancies between x86 and other CPU types.


Why when I execute do I get the error message 'libpgc.so: cannot open shared object file'?

The current Linux releases feature a shared libpgc.so along with libpgthread.so. Before, the libraries libpgc.a and libpgthread.a were used, and they are still available.

The default link uses the shared libpgc and libpgthread, so that users can build one executable for several versions of Linux. For example, an application built with pgcc on a RHEL 5.2 system can also run on a SuSE 11-3 system. An executable built with gcc, by comparison, should run on any Linux system with libc installed.

We have done the same thing with libpgc.so and libpgthread.so. If you wish to execute code built on, for example, your Ubuntu 10.04 system on another RHEL 6.1 system, simply copy libpgc.so and libpgthread.so from $PGI/linux86-64/2012/REDIST to the target system, and add the directory where you placed them to the LD_LIBRARY_PATH environment variable.

As an example, suppose we build 'hello world' on a RHEL 6.0 system:

% more hello.c
#include <stdio.h>
int main(){printf("hello\n"); return 0;}
% pgcc -o hi hello.c
% hi
hello

To run this program on platform B, which is Ubuntu 11.04

% rcp hi  B:/your_B_dir/hi   ! copy executable to B
% rcp $PGI/linux86-64/2012/REDIST/*.so B:/tmp/.
% rsh B 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/tmp'
(note: if "rsh B 'echo $LD_LIBRARY_PATH'" indicates it is not defined, use
% rsh B 'export LD_LIBRARY_PATH=/tmp' )    ! update dynamic lib path on B
% rsh B '/your_B_dir/hi' ! run executable on B
hello

If libpgc.so does not exist in $PGI/linux86-64/2012/libso, the library is resolved from the directories listed in $LD_LIBRARY_PATH, provided the shared library is present there.


Do you have any relative performance numbers?

While performance is a very important reason for using the PGI compilers, typically we do not publish any relative performance numbers. Performance depends upon too many factors to make a credible claim that we are N% faster than a competitor. The only true measure is how your application performs on your system. Please download an evaluation copy of the PGI compilers and try it out.

Two organizations that do publish performance results are Standard Performance Evaluation Corporation (SPEC) and Polyhedron. Polyhedron also allows you to download the benchmark source code so you can do your own performance comparison. Again, when looking at these results remember that the only true benchmark is your application.


Do you have an example of using -byteswapio?

Here is an example of using the -byteswapio switch.

Example

% more rtest.f
      program test
      real*4 ssmi
      OPEN(UNIT=10,FILE='ice.89',FORM='UNFORMATTED')
      read(10) ssmi
      print *,'OK: ',ssmi
      end

% more wtest.f
      program test
      real*4 ssmi
      ssmi = -999
      OPEN(UNIT=10,FILE='ice.89',FORM='UNFORMATTED')
      write(10) ssmi
      print *,'OK: ',ssmi
      end

On your Sun workstation (or other big-endian device):

f77 -o w_sparc wtest.f
f77 -o r_sparc rtest.f

On your PGI workstation:

pgfortran -o w86 wtest.f
pgfortran -o w86_swap -byteswapio wtest.f
pgfortran -o r86 rtest.f
pgfortran -o r86_swap -byteswapio rtest.f

If you write the file  |  Then read the file
ice.89 with            |  ice.89 with
-----------------------+----------------------
w_sparc or w86_swap    |  r_sparc or r86_swap
w86                    |  r86
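
The byte reversal behind this pairing can be sketched in C. This is an illustration of byte swapping in general, not the PGI runtime's implementation of -byteswapio; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Reinterpret a 4-byte float with its bytes in the opposite order. */
float swap_float(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* copy out the raw bytes */
    bits = swap32(bits);
    memcpy(&x, &bits, sizeof x);      /* copy the swapped bytes back */
    return x;
}
```

Swapping twice restores the original value, which is why a file written with byte swapping has to be read with byte swapping (or on a machine of the opposite endianness).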


Does the License Manager allow me to execute on other platforms?

Executables created by PGI compilers are not licensed. There are no license requirements for executables created on any platform. If you compiled with a trial license, the executable will stop functioning after the trial period expires. To prevent this, you will need to recompile codes after permanent license keys are installed.


Why does read(9,rec=recnr,end=100,err=101,iostat=ios) act differently on other compilers?

For statements like

    read(9,rec=recnr,end=100,err=101,iostat=ios) buf

some machines will jump to 100 and return an IOSTAT of -1 upon reaching the end of the file, while other machines take the ERR= exit and go to 101.

The f77 and f90 standards distinguish between 'error conditions' and 'end-of-file'. The 'correct' (standard conforming, portable) way of writing the read statement is to capture both 'errors' and 'end-of-file', like the following:

    read(card(iarg:),*,err=701,end=701) iskip, irec1

It is true that the SGI treats an end-of-file condition as an error condition if the ERR= specifier is present and the END= specifier is not present. However, this behavior is inconsistent across systems (for example, HP & g77 both abort execution and report an end-of-file).

Another test case shows another inconsistency in various implementations. Consider this test:

        open(unit=10,file='foo',form='unformatted')
        read(10, err=99, iostat=ios) yy
        print *, 'fail1', ios
        stop
99      continue
        print *, 'fail2', ios
        end

According to the standards, if the 'err' branch is taken, the iostat variable will be defined with a positive value. Given that the SGI takes the ERR= branch in the original example, this test should take the ERR= branch as well. But on the SGI, this test executes as:

 fail1          -1

[NOTE that -1 => end-of-file]

The DEC alpha is another system where the ERR= branch is taken for the original example. But, the test above executes as:

 fail2          -1

But in this case, iostat shouldn't be negative since the ERR= branch was taken.

The point of all this is that there are inconsistencies in the way ERR= is handled given an 'end-of-file' condition. Adding the END= specifier to your example guarantees consistent behavior across 'all' systems.
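
C draws the same distinction between end-of-file and a read error, and a short sketch may make the idea concrete. The enum and function names below are hypothetical; the sign convention (negative for end-of-file, positive for error) is chosen to mirror IOSTAT.

```c
#include <stdio.h>

/* Outcome of one read attempt, kept distinct as with END= and ERR=. */
enum read_status { READ_OK = 0, READ_EOF = -1, READ_ERROR = 1 };

/* Try to read one float from an open binary stream. */
enum read_status read_one(FILE *stream, float *value)
{
    if (fread(value, sizeof *value, 1, stream) == 1)
        return READ_OK;
    if (ferror(stream))
        return READ_ERROR;   /* analogous to the ERR= branch */
    return READ_EOF;         /* analogous to the END= branch */
}
```

Checking both conditions explicitly, rather than relying on one to imply the other, is the C counterpart of supplying both END= and ERR=.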


(32-bit Linux compilers) Why can't my compiled code handle even half of the 2GB of memory in my system?

Users now are capable of buying machines with > 4GB of memory in them, so they expect to be able to declare very large arrays. Most understand that the accessible limit ought to really be 2GB for a 32-bit addressable system, when you assume that signed ints may be involved in libraries that work with addresses.

Here are some things we have learned, from users who were more familiar with Linux.

  1. The Linux kernel places shared libraries at 0x40000000 by default, so on x86 you have only about 1GB total for your program code and other elements you provide. It has nothing to do with gcc.

    Possible solutions: (a) link statically, such that there aren't any shared libraries. Or (b) use malloc() to allocate the arrays. That should give you about 3GB total (but note that malloc() can't allocate a single chunk larger than 2GB).

         -Wl,-Bstatic
    

    will force a static link.

  2. If you wish to modify the kernel, in the kernel source, in file mmap.c, there is a line that reads:

         addr = TASK_UNMAPPED_BASE;
    

    This is what sets the default address of the shared libs in the memory mapping, and it's at 0x40000000 (1G) by default. So change it to, for example:

         addr = 0x80000000;
    

    And you should, in theory, have up to ~2GB to use for the codes.

  3. For more info on this, check out the comp.os.linux.development.system newsgroup.
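
Option (b) in item 1, allocating the arrays with malloc(), can be sketched as follows. The helper name alloc_array is hypothetical; the point is only that the array is requested from the heap at run time and that the request can fail.

```c
#include <stdlib.h>

/* Allocate an array of n doubles on the heap instead of declaring it
   statically; returns NULL if the request cannot be satisfied. */
double *alloc_array(size_t n)
{
    if (n > (size_t)-1 / sizeof(double))
        return NULL;             /* n * sizeof(double) would overflow */
    return malloc(n * sizeof(double));
}
```

The caller should always test the returned pointer before use and release the array with free() when done.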

The bottom line is:

  1. It is an OS problem, not a compiler problem.
  2. Sometimes, you may have to do a lot of work on Linux to use all of your memory.


When executing, why do I get a stack overflow on Windows?

The error exhibits itself as either stack overflow or sometimes the program just hangs.

To enlarge the stack space, edit the driver file C:\Program Files\PGI\win32\[RELEASE#]\bin\win32rc, and change the line

  LDARGS=""

to something like

  LDARGS="-stack 10000000,50000"

which will enlarge the stack area (maximum size, commit size) of the executable. Relink your application and execute.


When I compile with -Ktrap=inexact, the program gets many exceptions

The PGI compilers do not support exception free execution for ‑Ktrap=inexact. The purpose of the hardware support is for those who have specific uses for its execution, along with the appropriate signal handlers for handling exceptions it produces. It is not meant for normal floating point operation code support.


Why do I get 'illegal instruction' exceptions when I run the program on another platform?

This usually indicates that your CPU does not support an instruction the compiler generated, probably because the code was generated for a newer CPU than the one currently executing it. You can force the compiler to generate code for an older CPU, or you can generate a PGI Unified Binary executable that combines code for a newer CPU type with code for an older CPU type that is used everywhere else.

The instruction

  pgfortran -V

will output the type of CPU that the PGI compilers determine your machine has.


Why do I get different answers with -r8 set when I declared all my variables as REAL*8?

The Fortran standard does not define the default size of constants. Our compiler treats constants as REAL*4 unless you compile with ‑r8. So a program like

       real*8  x,y,z
       x=50453.61
       y=29581.28
       z=x*y
       write(*,10)x,y,z, 50453.61*29581.28
 10    format(4f20.5)
       end

will produce different answers with ‑r8 set. Code that will treat constants as REAL*8 everywhere should be written as

       real*8  x,y,z
       x=50453.61D0
       y=29581.28D0
       z=x*y
       write(*,10)x,y,z, 50453.61D0*29581.28D0
 10    format(4f20.5)
       end
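
The same effect exists in C, where a constant with an f suffix is single precision and an unsuffixed constant is double precision. The sketch below is an analogy only, and the function name is hypothetical.

```c
/* Returns 1 if storing the single-precision constant in a double gives
   a different value from the double-precision constant. */
int single_constant_differs(void)
{
    double from_single = 50453.61f;   /* like 50453.61 without -r8 */
    double from_double = 50453.61;    /* like 50453.61D0 */
    return from_single != from_double;
}
```

The function returns 1: declaring the variable as double does not recover the digits that were lost when the constant itself was rounded to single precision.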


How do I detect infs and NaNs in pgfortran?

REAL*4 and REAL*8 numbers have a specific format for their exponent and mantissa. Values whose exponents exceed the format limits are printed as inf. 32-bit and 64-bit numbers with mantissas falling outside the boundaries of the IEEE formats are printed as NaN. Overflows or division by zero can result in inf values, while illegal operations such as sqrt(-1.) and log(-10.0) produce NaN.
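
For comparison, the same two kinds of special values can be produced and recognized in C with the standard isnan and isinf macros. This is a minimal sketch with hypothetical function names; it uses division rather than library calls to generate the values.

```c
#include <math.h>

/* Classify a double: 2 = NaN, 1 = +/-Inf, 0 = finite. */
int classify(double x)
{
    if (isnan(x)) return 2;
    if (isinf(x)) return 1;
    return 0;
}

/* Divide at run time; volatile keeps the compiler from evaluating
   (and possibly rejecting) the operation at compile time. */
double divide(double num, double den)
{
    volatile double n = num, d = den;
    return n / d;
}
```

Under IEEE arithmetic, divide(1.0, 0.0) is classified as an infinity and divide(0.0, 0.0) as a NaN.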

The x86-64 architecture has status bits set by floating point operations. pgfortran traps some of those status bits, ensuring that they are not set as a side effect of compiler generated code. Specifically, these are the status bits ovf for overflow, divz for divide by zero, and inv for invalid. Other status bits, including inexact, align, unf, and denorm, are not guaranteed to remain unset as a side effect of normal compiler operation.

Some examples follow. To create infs and NaNs at execution time, you may need to code in such a way as to prevent the compiler from catching your improper operation.

program infnan
  real(8) v,w,x,y,z
  v = -1.0d0
  w = 123456789.0d0
  x = v * 10.d0 * log10(v)
  y= exp(w)
  z=1.0d0 /(v + 1.0d0)
  print *,'x (inv ?) =',x
  print *,'y (ovf? ) =',y
  print *,'x (divz?) =',z
end program infnan

% pgfortran -o infnan infnan.f90
% ./infnan
 x (inv ?) =                       NaN
 y (ovf? ) =                       Inf
 x (divz?) =                       Inf

The F2003 standard states the status bits set will be printed after a STOP statement.

program infnan2
  real(8) v,w,x,y,z
  v = -1.0d0
  w = 123456789.0d0
  x = v * 10.d0 * log10(v)
  y= exp(w)
  z=1.0d0 /(v + 1.0d0)
  print *,'x (inv ?) =',x
  print *,'y (ovf? ) =',y
  print *,'x (divz?) =',z
  stop
end program infnan2

% pgfortran -o infnan2 infnan2.f90
% ./infnan2
 x (inv ?) =                       NaN
 y (ovf? ) =                       Inf
 x (divz?) =                       Inf
Warning: ieee_invalid is signaling
Warning: ieee_divide_by_zero is signaling
Warning: ieee_inexact is signaling
FORTRAN STOP

If you wish the program to terminate when an operation causes a status bit change, compile with the following switches: -Ktrap=ovf,inv,divz or its shorthand -Ktrap=fp. You can also set the environment variable PGI_TERM to determine what happens when the status bits are set. You can even invoke the PGDBG debugger upon the event.

% pgfortran -o infnan infnan.f90 -Ktrap=divz -g
% ./infnan
Floating exception

% setenv PGI_TERM 'signal'
% ./infnan
Error: floating point exception, integer divide by zero

% setenv PGI_TERM 'signal,debug'
% ./infnan
Error: floating point exception, integer divide by zero
PGDBG 14.1-0 x86-64 (Cluster, 256 Process)
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION.  All rights reserved.
Loading symbols from /my/dir/infnan ...
Loaded: /my/dir/infnan
Stopped at 0x7F229025219A, function __waitpid
0x7F229025219A:  48 3D 0 F0 FF FF       cmpq   $0xFFFFFFFFFFFFF000,%rax

pgdbg> file /my/dir/infnan.f90
"infnan.f90"
pgdbg> list
 #1:     program infnan
 #2:       real(8) v,w,x,y,z
 #3:       v = -1.0d0
 #4:       w = 123456789.0d0
 #5:       x = v * 10.d0 * log10(v)
 #6:       y= exp(w)
 #7:       z=1.0d0 /(v + 1.0d0)
 #8:       print *,'x (inv ?) =',x
 #9:       print *,'y (ovf? ) =',y
 #10:      print *,'x (divz?) =',z

pgdbg>

If you want to determine if any element of a REAL array is a NaN, use the IEEE_ARITHMETIC routine ieee_is_nan(x) which takes real arguments, and because it is elemental, we can feed the entire array to it.

program infnan3
  use,intrinsic::IEEE_ARITHMETIC
  real(8) x(1000),y
  integer i
  do i=1,1000
     x(i)=i
  end do

  y = -1.0d0
  x(500) = y * 10.d0 * log10(y)

  if(any(ieee_is_nan(x))) then 
     print *, "we found a NaN!"
  else 
    print *, "we found NO NaNs!"
  end if
end program infnan3


% pgf90 -o infnan3 infnan3.f90
% ./infnan3
 we found a NaN!

With this technique, we can find whether an array has one or more NaNs, but not the index of the failing element. To do that, we need to use a loop.

program infnan4
  use,intrinsic::IEEE_ARITHMETIC
  real(8) x(1000),y
  integer i
  do i=1,1000
     x(i)=i
  end do

  y = -1.0d0
  x(500) = y * 10.d0 * log10(y)
  x(600) = y /(y + 1.0d0)
  do i=1,1000
   if(ieee_is_nan(x(i))) then
      print *,"X(",i,") is a NaN!"
   endif
   if(.not.(ieee_is_finite(x(i)))) then
      print *,"X(",i,") is an inf!"
   endif
   end do
end program infnan4

% pgfortran -o infnan4 infnan4.f90
% ./infnan4
 X(          500 ) is a NaN!
 X(          500 ) is an inf!
 X(          600 ) is an inf!

Note that X(500) is reported twice because ieee_is_finite returns false for a NaN as well as for an infinity.


Why does date_and_time(DATE) return 20 for the century?

Century is just defined to be a group of 100 years. For 2012, date_and_time(DATE) sets the century and year values in DATE to 20 and 12, respectively. So, for example, 2012 = 20*100 + 12

Here is a Fortran program that uses date_and_time().

program testread
  implicit none
  integer::cen,year,mon,day
  character(len=8) :: date
  call date_and_time(date)
  read(date,'(4i2)')cen,year,mon,day
  print *,cen,year,mon,day
end program
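
The same split of a CCYYMMDD string into century, year, month, and day can be written in C with sscanf. This is a sketch only, with a hypothetical function name; it assumes the eight-character DATE form shown above.

```c
#include <stdio.h>

/* Split a CCYYMMDD string, as returned in DATE, into four 2-digit
   fields. Returns 1 on success, 0 if the string does not parse. */
int split_date(const char *date, int *cen, int *year, int *mon, int *day)
{
    return sscanf(date, "%2d%2d%2d%2d", cen, year, mon, day) == 4;
}
```

For the string "20120315" the fields are 20, 12, 3, and 15, and the full year is recovered as cen*100 + year = 2012.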

What should I consider when building executables to run on other versions of Linux?

To build executables for portability across multiple Linux distributions, procedures can be linked statically at build time into the executable on the build machine, or they can be linked dynamically at runtime on the target machine. If you link dynamically, you may need to carry the procedure libraries over to the target machine if versions don't already exist there in a location where the executable expects to find them.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary. For example, compiling with the option -tp=p7,nehalem will generate p7 type code for older CPUs as well as code optimized for Nehalem-class CPUs and newer.
  • Both PGI and gcc libraries try to be forward compatible. If your executable builds on the oldest Linux platform and the oldest gcc version you wish to support, it has a good chance of working on newer Linux distributions and gcc versions. If the span between the oldest and the newest Linux/PGI/gcc versions is very large, changes in your code can result in problems. Header file and procedure interfaces could have incompatibilities requiring a narrower range of software versions to function properly.
  • You may either copy the files in the PGI REDIST directory to the target system on which you wish to run your executable, or link with the -Bstatic_pgi switch to ensure that all PGI-specific routines are linked statically, and thus are part of the executable. -Bstatic_pgi allows system and libc routines to link dynamically so that the versions of those routines residing on the target system will be used at execution.
  • The performance of OpenMP codes can sometimes be improved by using the libnuma library, which is linked in by default if properly installed on your system. If you have OpenMP directives in your code, compile with both -mp and -mp=nonuma and compare performance. If you wish to build OpenMP executables that will run on machines with and without the libnuma library present, build the executable with -mp=nonuma (or -nomp).
  • Use the ldd command along with your executable's name (ldd foo.exe for example) on the build machine to determine where your executable expects to find the dynamically linked libraries on the target machine. You can preempt these locations by specifying a directory or directories using the $LD_LIBRARY_PATH environment variable on the target system.


What should I consider when building executables to run on other versions of Windows?

To build executables for portability across multiple Windows platforms, consider the following.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary.
  • Compile on the oldest Windows version, to take advantage of the forward compatibility.
  • Note that prior to Windows 7, PGI compilers use a different tool chain (assemblers, linkers and runtime libraries collectively known as the Windows SDK) than used with Windows 7 and newer. PGI does not support executables built with the older pre-Windows 7 SDK working on Windows 7 and newer versions.


What should I consider when building executables to run on other versions of OS X?

To build executables for portability across multiple OS X platforms, consider the following.

  • Compile for the oldest x86 CPU you wish to support, or create a PGI Unified Binary.
  • Compile on the oldest OS X and Xcode versions, to take advantage of the forward compatibility.
  • Xcode version 4.3 and later use Clang. PGI executables built with non-Clang based Xcode versions are not supported on systems where PGI uses Clang based Xcode versions.


Will every program I build run everywhere?

No. Much depends on what system you build it on, and how much your program uses system routines that have changed from OS release to OS release. To reduce the number of porting issues, we recommend you replace system routine calls with calls to standard Fortran, C, and C++ procedures which are available in every version of a standard-compliant compiler.


With pgcc/pgc++, I have trouble opening binary files on Windows, but not Linux. Why?

To port C code containing reads/writes of binary files to Windows, follow these steps:

  • for open(), include O_BINARY in the second argument (oflag) passed to open(). For example:

    fd = open( "FILE", O_CREAT|O_BINARY );
    

  • for fopen(), include 'b' in the second argument (mode) passed to fopen() like this:

    fp = fopen( "FILE", "wb" );
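
A round trip through a file opened with "wb" and "rb" can be sketched as below. The function name and the file name used in the usage note are hypothetical. The sample bytes include 0x0A and 0x1A, which text mode on Windows is commonly documented to treat specially (newline translation and end-of-file), so binary mode is what keeps them intact.

```c
#include <stdio.h>
#include <string.h>

/* Write n bytes to path in binary mode, read them back in binary mode,
   and return 1 if the bytes survive unchanged, else 0. */
int binary_round_trip(const char *path, const unsigned char *data, size_t n)
{
    unsigned char buf[64];
    size_t got;
    FILE *fp;

    if (n > sizeof buf) return 0;

    fp = fopen(path, "wb");              /* 'b' requests binary mode */
    if (fp == NULL) return 0;
    if (fwrite(data, 1, n, fp) != n) { fclose(fp); return 0; }
    fclose(fp);

    fp = fopen(path, "rb");
    if (fp == NULL) return 0;
    got = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    remove(path);                        /* clean up the scratch file */

    return got == n && memcmp(buf, data, n) == 0;
}
```

Calling binary_round_trip("pgi_binary_demo.tmp", bytes, sizeof bytes) with bytes = {0x0D, 0x0A, 0x1A, 0x00, 0x41} should return 1 on both Linux and Windows when binary mode is used.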
    


Are there licensing issues with running executables on other platforms?

Executable files built with any PGI license do NOT require a separate license for distribution (a/k/a a runtime license). Any files or libraries located in the PGI REDIST directory can be distributed for use with PGI compiled applications, within the provisions of PGI End-user License Agreement.

Executable files created using a FlexNet-based PGI license are permanent and perpetual (do not expire), and may be moved to other systems. Executable files created using a PGI Developer license are also permanent and perpetual but are restricted to running only on the same system as they were compiled. Executable files created using PGI temporary licenses (starter, trial, demo) are time limited. The time limit can be removed by recompiling with a permanent license.


How do you call C and C++ routines from pgfortran (ISO_C_BINDING)?

The answer is ISO_C_BINDING, a standard for Fortran-to-C interlanguage communication. Properly used, ISO_C_BINDING should work across many current Fortran compilers. However, it is limited to passing arguments to C routines, not C++. Be sure that the 'C' entry points are present in any C++ routines that are called (extern "C").

The data types in Fortran and C are different. Different Fortran compilers may also treat what looks like the same data type (integer, real, pointer) differently than C and C++ do.

ISO_C_BINDING allows you to declare interfaces that target C routines and data types. This means you don't need to modify the C code you want to call from Fortran. In short, ISO_C_BINDING lets you treat C code as if it were just another Fortran routine.

For example, to call this C subroutine

more csub_a.c: 

#include <stdio.h>
void foo(int *int_arg1, float *real_arg2)
{
  printf("The arguments passed are int_arg1=%d\n and real_arg2=%f \n",*int_arg1, *real_arg2);
 
}

from this Fortran main


more fmain_a.f90:

PROGRAM FORTRAN2C
INT_ARG1=1234
REAL_ARG2=5678.9
PRINT *,"passing integer ",INT_ARG1," and real ",REAL_ARG2
CALL FOO(INT_ARG1,REAL_ARG2)
END

you would write a Fortran interface record like this

added interface lines: 

INTERFACE
    SUBROUTINE FOO(INT_ARG1, REAL_ARG2) BIND(C, NAME="foo")
    USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT,C_FLOAT
    IMPLICIT NONE
    INTEGER(C_INT) :: INT_ARG1
    REAL(C_FLOAT) :: REAL_ARG2
  END SUBROUTINE FOO
END INTERFACE


This results in


PROGRAM FORTRAN2C
INTERFACE
    SUBROUTINE FOO(INT_ARG1, REAL_ARG2) BIND(C, NAME="foo")
    USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INT,C_FLOAT
    IMPLICIT NONE
    INTEGER(C_INT) :: INT_ARG1
    REAL(C_FLOAT) :: REAL_ARG2
  END SUBROUTINE FOO
END INTERFACE
INT_ARG1=1234
REAL_ARG2=5678.9
PRINT *,"passing integer ",INT_ARG1," and real ",REAL_ARG2
CALL FOO(INT_ARG1,REAL_ARG2)
END

We can then link successfully with any C object.

pgcc -c csub_a.c -o csub_a_pgi.o
gcc -c csub_a.c -o csub_a_gcc.o
icc -c csub_a.c -o csub_a_intel.o

pgfortran -o f2c_a_pgi  fmain_a.f90  csub_a_pgi.o
pgfortran -o f2c_a_gcc fmain_a.f90 csub_a_gcc.o
pgfortran -o f2c_a_intel fmain_a.f90 csub_a_intel.o

./f2c_a_pgi

 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9 

./f2c_a_gcc

 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9 

./f2c_a_intel

 passing integer          1234  and real     5678.900    
The arguments passed are int_arg1=1234
 and real_arg2=5678.9

Here's another example with more data types passed.

more fmain_b.f90 

       program fort2c
       use fmain2csub_b
       logical*1              bool1
       character              letter1
       integer*4              numint1, numint2
       real                   numfloat1
       double precision       numdoub1
       integer*2              numshor1
call cfunc (bool1, letter1, numint1, numint2, & 
        numfloat1, numdoub1, numshor1)
write( *, 100) & 
        bool1, letter1, numint1, numint2, numfloat1, & 
        numdoub1, numshor1
100    format(1x,"bool1     =  ", L2,/, &          
        " letter1   =  ", A2,/,                 &
        " numint1   = ", I5,/,                  &
        " numint2   = ", I5,/,                  &
        " numfloat1 = ", F6.1,/,                &
        " numdoub1  = ", F6.1,/,                &
        " numshor1  = ", I5,/)
       end


more csub_b.c

#include <stdio.h>
#define TRUE 0xff
#define FALSE 0
void
cfunc(char *bool1, char *letter1, int *numint1, int *numint2,
      float *numfloat1, double *numdoub1, short *numshor1)
{
   *bool1 = TRUE;
   *letter1 = 'v';
   *numint1 = 11;
   *numint2 = -44;
   *numfloat1 = 39.6 ;
   *numdoub1 = 39.2 ;
   *numshor1 = 981;

}

This time we'll create an interface record and put it in a separate module called fmain2csub_b

more fmain2csub_b_mod.f90

module fmain2csub_b
INTERFACE
    subroutine cfunc ( bool1, letter1, numint1, &
    numint2, numfloat1, numdoub1, numshor1) BIND(C,NAME="cfunc")
    use, intrinsic  :: iso_c_binding, only:C_CHAR,C_BOOL, &
        C_INT,C_FLOAT,C_DOUBLE,C_SHORT
    logical(C_BOOL) ::      bool1
    character(C_CHAR) ::    letter1
    integer(C_INT) ::       numint1, numint2
    real(C_DOUBLE) ::       numdoub1
    real(C_FLOAT) ::        numfloat1
    integer(C_SHORT) ::     numshor1
    end subroutine cfunc
END INTERFACE
end module fmain2csub_b

When we build with an interface module (like fmain2csub_b) we need to link in the interface module object file as well.


pgcc csub_b.c  -c  -o csub_b_pgi.o
gcc csub_b.c -c -o csub_b_gcc.o
icc csub_b.c -c -o csub_b_intel.o


pgfortran -c fmain2csub_b_mod.f90
pgfortran -o f2c_b_pgi fmain_b.f90 csub_b_pgi.o fmain2csub_b_mod.o
pgfortran -o f2c_b_gcc fmain_b.f90 csub_b_gcc.o fmain2csub_b_mod.o
pgfortran -o f2c_b_intel fmain_b.f90 csub_b_intel.o fmain2csub_b_mod.o
    


f2c_b_pgi

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
    

f2c_b_gcc

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
  

f2c_b_intel

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981

Here is the same main program and interface module but this time with an equivalent C++ routine. Note the use of -pgcpplibs in the link step.

more cppsub_b.cpp

#define TRUE 0xff
#define FALSE 0
extern "C" {
extern void
cfunc(char *bool1, char *letter1, int *numint1,int *numint2, float *numfloat1,
        double *numdoub1, short *numshor1)
{
   *bool1 = TRUE;
   *letter1 = 'v';
   *numint1 = 11;
   *numint2 = -44;
   *numfloat1 = 39.6 ;
   *numdoub1 = 39.2 ;
   *numshor1 = 981;

}
}


g++ -c cppsub_b.cpp -o cppsub_b_g++.o
icc -c cppsub_b.cpp -o cppsub_b_intel.o
pgcpp -c cppsub_b.cpp -o cppsub_b_pgi.o
pgfortran -c fmain2csub_b_mod.f90
pgfortran -o f2cpp_b_pgi fmain_b.f90 cppsub_b_pgi.o fmain2csub_b_mod.o -pgcpplibs
pgfortran -o f2cpp_b_g++ fmain_b.f90 cppsub_b_g++.o fmain2csub_b_mod.o -pgcpplibs
pgfortran -o f2cpp_b_intel fmain_b.f90 cppsub_b_intel.o fmain2csub_b_mod.o -pgcpplibs
f2cpp_b_pgi

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981


f2cpp_b_g++

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981


f2cpp_b_intel

 bool1     =   T
 letter1   =   v
 numint1   =    11
 numint2   =   -44
 numfloat1 =   39.6
 numdoub1  =   39.2
 numshor1  =   981
 

How do you call Fortran routines from a C/C++ routine?

Interfacing to Fortran subroutines and functions is very similar to what we did with ISO_C_BINDING, but there is no equivalent standard defined. When calling Fortran routines from C/C++, both the called Fortran program and the calling C/C++ programs need to be altered. The steps involve:

  • Create a C/C++ prototype for the Fortran routine that declares the called routine's entry point and its arguments using the equivalent C/C++ data types. The entry point is usually the function/subroutine name in all lower case, with an appended underscore.
  • Use ISO_C_BINDING to declare the Fortran arguments as compatible C arguments. Instead of an interface record, you are defining the entry arguments to use the ISO_C_BINDING data types.
  • Initialize pgfortran internal tables. This step is specific to pgfortran: call pghpf_init() from the user's main program before the first call into Fortran, so that any subsequent pgfortran interaction is consistent.

Suppose you wish to call the following function in Fortran:

more forts_c.f90

subroutine forts ( bool1, letter1, numint1, numint2, numfloat1, numdoub1, numshor1 )
logical*1        :: bool1 
character*1      :: letter1 
integer          :: numint1 
integer          :: numint2 
real             :: numfloat1 
double precision :: numdoub1
integer*2        :: numshor1 
    bool1 = .true.
    letter1="v"
    numint1=123
    numint2=-456
    numdoub1=5432.1
    numfloat1=6789.0
    numshor1=53
    return
end

And you wish to call the Fortran routine from a C++ main program like the following:

more cmain_c.C

#include <iostream>
int main(int argc, char **argv)
{
   char          bool1;
   char          letter1;
   int           numint1, numint2;
   float         numfloat1;
   double        numdoub1;
   short         numshor1;
   int           i;
   for (i=0; i < argc; i++){
       std::cout << "main: command line arg " << i << " is " << argv[i] << std::endl;
   }
   forts(&bool1,&letter1,&numint1,&numint2,&numfloat1,
          &numdoub1,&numshor1);
   std::cout << "main: bool1=     " << (bool1?"TRUE":"FALSE") << std::endl;
   std::cout << "main: letter1=   " << letter1  << std::endl;
   std::cout << "main: numint1=   " << numint1  << std::endl;
   std::cout << "main: numint2=   " << numint2  << std::endl;
   std::cout << "main: numfloat1= " << numfloat1  << std::endl;
   std::cout << "main: numdoub1=  " << numdoub1  << std::endl;
   std::cout << "main: numshor1=  " << numshor1  << std::endl;
}

To make this work, we need to add/modify the following lines in each source file.

more cmain_c.C

#include <iostream>
extern "C" {
extern void forts_( char *, char *, int *,int *,float *,
                    double *,short * );
}
#if defined (_PGI_)
extern "C" void pghpf_init(int *);
static int zz = 0;
#endif
int main(int argc, char **argv)
{
   char          bool1;
   char          letter1;
   int           numint1, numint2;
   float         numfloat1;
   double        numdoub1;
   short         numshor1;
   int           i;
#if defined (_PGI_)
   pghpf_init(&zz);
#endif
   for (i=0; i < argc; i++){
       std::cout << "main: command line arg " << i << " is " << argv[i] << std::endl;
   }
   forts_(&bool1,&letter1,&numint1,&numint2,&numfloat1,
          &numdoub1,&numshor1);
   std::cout << "main: bool1=     " << (bool1?"TRUE":"FALSE") << std::endl;
   std::cout << "main: letter1=   " << letter1  << std::endl;
   std::cout << "main: numint1=   " << numint1  << std::endl;
   std::cout << "main: numint2=   " << numint2  << std::endl;
   std::cout << "main: numfloat1= " << numfloat1  << std::endl;
   std::cout << "main: numdoub1=  " << numdoub1  << std::endl;
   std::cout << "main: numshor1=  " << numshor1  << std::endl;
}

more forts_c.f90

subroutine forts ( bool1, letter1, numint1, numint2, numfloat1, numdoub1, numshor1 )
use, intrinsic  :: iso_c_binding, only:C_CHAR,C_BOOL, &
    C_INT,C_FLOAT,C_DOUBLE,C_SHORT
logical(C_BOOL) ::      bool1
character(C_CHAR) ::    letter1
integer(C_INT) ::       numint1, numint2
real(C_DOUBLE) ::       numdoub1
real(C_FLOAT) ::        numfloat1
integer(C_SHORT) ::     numshor1
    bool1 = .true.
    letter1="v"
    numint1=123
    numint2=-456
    numdoub1=5432.1
    numfloat1=6789.0
    numshor1=53
    return
end


pgfortran -c forts_c.f90
pgcpp -o c2f_pgi cmain_c.C forts_c.o -pgf90libs
./c2f_pgi

c2f_pgi is a program

main: command line arg 0 is c2f_pgi
main: command line arg 1 is is
main: command line arg 2 is a
main: command line arg 3 is program
main: bool1=     TRUE
main: letter1=   v
main: numint1=   123
main: numint2=   -456
main: numfloat1= 6789
main: numdoub1=  5432.1
main: numshor1=  53

or

ifort -c forts_c.f90
icc -o c2f_intel cmain_c.C forts_c.o
./c2f_intel

c2f_intel is another program

main: command line arg 0 is c2f_intel
main: command line arg 1 is is
main: command line arg 2 is another
main: command line arg 3 is program
main: bool1=     TRUE
main: letter1=   v
main: numint1=   123
main: numint2=   -456
main: numfloat1= 6789
main: numdoub1=  5432.1
main: numshor1=  53
 

Can you link programs compiled with pgcpp and programs compiled with g++?

Historically, pgcpp could not link with object files produced by other C++ compilers, because pgcpp uses a different name-mangling algorithm. That changed with the 13.* release compilers, which add a new compiler, pgc++, that uses the same name-mangling algorithm as g++.

For example, here is a C++ main routine.

more  cppmain_d.cc

#include <iostream>
extern double do_triad(double *a, double *b, double *c, double *d, int len, int rep);
using namespace std;
    
int main(int argc, char** argv) {
 
const int length=10000;
  
double *a = new double[length];
double *b = new double[length];
double *c = new double[length];
double *d = new double[length];
    
for(int i=0; i< length; ++i)
a[i]=b[i]=c[i]=d[i]=1.0;
 
do_triad(a,b,c,d,length,2);
    
delete [] a;
delete [] b;
delete [] c;
delete [] d;

return 0;
}

And here is a procedure it calls:

more  cppsub_d.cc

#include <iostream>
using namespace std;


double do_triad(double *a, double *b, double *c, double *d,
int len, int rep) {
int i,j;
{
for(j=0;j<rep;j++){
for(i=0;i<len;++i)
a[i]=b[i]+c[i]*d[i];
cerr << j << " of " << rep << endl;
}
}
return 0.0;
}

With pgcpp the link fails with undefined references, but with pgc++ the same link succeeds and the program runs.

g++ -c cppsub_d.cc
pgcpp -o pgcpp2g++  cppmain_d.cc cppsub_d.o

cppmain_d.cc:
cppmain_d.o: In function `main':
/home/tull/xfer/13.0/examples/C2C/./cppmain_d.cc:17: undefined reference to `do_triad__FPdN31iT5'
cppsub_d.o: In function `do_triad(double*, double*, double*, double*, int, int)':
cppsub_d.cc:(.text+0x9b): undefined reference to `std::cerr'
cppsub_d.cc:(.text+0xa0): undefined reference to `std::ostream::operator<<(int)'
cppsub_d.cc:(.text+0xad): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
cppsub_d.cc:(.text+0xba): undefined reference to `std::ostream::operator<<(int)'
cppsub_d.cc:(.text+0xbf): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)'
cppsub_d.cc:(.text+0xc7): undefined reference to `std::ostream::operator<<(std::ostream& (*)(std::ostream&))'
cppsub_d.o: In function `__static_initialization_and_destruction_0(int, int)':
cppsub_d.cc:(.text+0x113): undefined reference to `std::ios_base::Init::Init()'
cppsub_d.cc:(.text+0x118): undefined reference to `std::ios_base::Init::~Init()'



pgc++ -o pgc++2g++ cppmain_d.cc cppsub_d.o 
cppmain_d.cc:


./pgc++2g++ 
0 of 2
1 of 2