Technical News from The Portland Group

Using Microsoft MPI with PGI Workstation

In the previous issue of PGInsider we explored using Microsoft MPI (MSMPI) from within PGI Visual Fortran (PVF). In this issue we'll use PGI Workstation to take a behind-the-scenes look at how you can build, launch and debug MSMPI programs directly from the command line.

We'll start by building an MSMPI application using the PGI Fortran 95 compiler. We'll then walk through the steps of running and debugging it locally and on a cluster.

Don't currently have PGI compilers on Windows? Download PGI Workstation today and try it out with a free 14-day evaluation license.

Choose Your Environment

PGI Workstation compilers and tools are native Windows applications. They are equally at home in the Windows Command Prompt (cmd) environment and in the Cygwin bash shell that ships with PGI Workstation. The two environments are largely interchangeable (exceptions are noted below), and we'll work with both in this article.

The PGI Workstation installation process creates shortcuts that launch these shells from the Start menu with the environment pre-configured to use the PGI compilers and tools. From Start | All Programs:

bash shell: PGI Workstation | PGI Workstation
cmd prompt: PGI Workstation | PGI Workstation Tools | PGI Command Prompt

Build an MSMPI Application

Let's use the following MPI source file, prog.f90:

program BasicMPI

  include 'mpif.h'
  integer iRank, iRet
  logical iStatus    ! MPI_Initialized's flag argument is LOGICAL in Fortran
  
  call MPI_Init(iRet)
  call MPI_Initialized(iStatus, iRet)
  call MPI_Comm_rank(MPI_COMM_WORLD, iRank, iRet)

  print *, "Rank:", iRank

  call MPI_Finalize(iRet)

end

Compile this file with the PGI Fortran compiler:

PGI$ pgfortran -g prog.f90

PGF90-S-0017-Unable to open include file: mpif.h (prog.f90: 3)
  0 inform,   0 warnings,   1 severes, 0 fatal for basicmpi

The build error occurs because the compiler does not know that we're trying to use MSMPI. Add the -Mmpi=msmpi flag to the compilation line and try again.

PGI$ pgfortran -g -Mmpi=msmpi prog.f90

If you've already installed MSMPI on your system, prog.f90 should compile without errors. If you haven't installed MSMPI yet, you can download the HPC Pack 2008 SDK directly from Microsoft. Installing this package provides the MSMPI headers, libraries and tools you'll need to build your application.

Run and Debug an MSMPI Application Locally

Run your application from the command line:

PGI$ prog
 Rank:            0

The application ran as a single process. Because this is an MPI application, though, we can also run it using multiple processes. MSMPI applications can be run locally using mpiexec. Use the -n option to specify the number of processes to use when running the application. Here we'll use four:

PGI$ mpiexec -n 4 prog
 Rank:            2
 Rank:            1
 Rank:            3
 Rank:            0

Debugging the application using the PGI debugger is also straightforward. Simply invoke PGDBG with the -mpi option:

PGI$ pgdbg -mpi -n 4 prog

The above command launches the PGDBG graphical user interface (GUI). On start up, it will look like this:

PGDBG Startup Window

Select prog.f90 from the File drop-down box to display the source file. Let's set a breakpoint on the print line:

PGDBG Break Point

When an MPI program is loaded into pgdbg, it has already been run to an internal breakpoint location. To start debugging, you'll use Continue (select Cont on the Control menu or click the Continue button on the menu bar).

Run and Debug an MSMPI Application on a Cluster

If you are working on a Windows HPC Server 2008 cluster, you can run and debug your MSMPI application on both the head and compute nodes.

To continue the hands-on portion in the remainder of this article, you'll need a Microsoft HPC Server 2008 cluster and a PGI CDK license. Trial CDK licenses are available by submitting a PGI CDK Evaluation Request.

Microsoft's HPC Job Manager is the gatekeeper for running applications on Windows clusters. It's accessible from the Start | All Programs menu:

Microsoft HPC Pack | HPC Job Manager

You can use the Job Manager to launch and track jobs. You can also interact with the Job Manager directly from a Windows Command Prompt. We'll use this command-line interface to launch our application. Note that at this time the Job Manager cannot be used from within a Cygwin bash shell.

Open a PGI Command Prompt window. The executable that we'll invoke to launch our application is job.exe. Use the /? option to display help information about job.exe:

CMD> job /?
Usage
    job {operator} [options] [arguments]
Where
    operator:
        /? or /help   - Display this help message
        add           - Add a new task to an existing job.
        cancel        - Cancels a pending or running job.
        clone         - Clones a job
        list          - List jobs in the cluster
        listtasks     - List tasks on the cluster for a specific job
        modify        - Modifies an existing job
        new           - Creates a new job within the cluster
        requeue       - Requeue a given job.
        submit        - Submit a job to the cluster
        view          - View details of a Job

To launch a job we'll use the submit operator. You can use /? as an argument to job submit to see all of the options that can be used with the submit operator.

We'll start with just a couple of the submit options:

        /numcores            - The number of cores required by this job; the
                               job will run on at least the Minimum and no
                               more than the Maximum
        /workdir             - The working directory to be used during
                               execution for this task

For the /numcores option you will need to determine how many cores are available on your cluster. The Job Manager restricts the number of cores you can request to the maximum number available. Use the cluscfg tool to display information about your cluster, including the number of cores. For example:

CMD> cluscfg view
Cluster name                    : hpc-head
Version                         : 2.0.1551.0
Total number of nodes           : 5
    Ready nodes                 : 5
    Offline nodes               : 0
    Draining nodes              : 0
    Unreachable nodes           : 0
Total number of cores           : 8
    Busy cores                  : 0
    Idle cores                  : 8
    Offline cores               : 0
Total number of jobs            : 2
    Configuring jobs            : 0
    Submitted jobs              : 0
    Validating jobs             : 0
    Queued jobs                 : 0
    Running jobs                : 0
    Finishing jobs              : 0
    Finished jobs               : 2
    Canceling jobs              : 0
    Canceled jobs               : 0
    Failed jobs                 : 0
Total number of tasks           : 2
    Configuring tasks           : 0
    Submitted tasks             : 0
    Queued tasks                : 0
    Running tasks               : 0
    Finished tasks              : 2
    Canceling tasks             : 0
    Canceled tasks              : 0
    Failed tasks                : 0

We have eight cores available on five nodes; we'll use all eight cores in our example.

For the /workdir option, you can provide the full path to the application's working directory. This path must be in Uniform Naming Convention (UNC) format. UNC names have the following general form:

\\computer name\folder1\folder2\file
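As an aside (this is an illustration only, not part of the PGI or MSMPI tools), Python's standard library can assemble and pick apart UNC paths of this form. The host name (hpc-head) and share name (SHARED) below are the example names used later in this article:

```python
# Illustrative only: inspecting a UNC path with Python's standard library.
# The host (hpc-head) and share (SHARED) are hypothetical example names.
from pathlib import PureWindowsPath

workdir = PureWindowsPath(r"\\hpc-head\SHARED\example")
print(workdir.drive)   # the \\computer name\share portion: \\hpc-head\SHARED
print(workdir.name)    # the final component: example
```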

One more prerequisite before we're ready to run: for an application to run on a cluster, it must be located in a directory that has been designated as shared. In Windows Explorer, right-click the folder you want to share and select the Share option to open the File Sharing dialog box, then click the Share button. Your directory will now have a Sharing icon associated with it:

Shared Folder

Now you are ready to run your program on the cluster. Adjust the number of cores and working directory path in the following command according to your system configuration:

CMD> job submit /numcores:8 /workdir:\\hpc-head\SHARED\example mpiexec prog
Job has been submitted. ID: 3.

Check the status of your job in the Job Manager:

Job Manager Window

Program output can be viewed directly in the Job Manager. To do so, start by selecting a job (upper pane) and task (lower pane) in the Job Manager. Use Action | Task Actions | View Task to open the Task Properties dialog. Program output will appear in the Output pane.

Task Output Window

You can also use the /stdout option to job submit to redirect standard output to a file, for example /stdout:out.txt.

The PGI debugger can be launched in conjunction with the Job Manager so you can debug your application when it is running on the cluster. The command line you'll use is similar to the one you used when launching without debugging:

CMD> pgdbg -pgserv -mpi:job.exe submit /numcores:8 /workdir:\\hpc-head\SHARED\example mpiexec prog

The debugging experience on a cluster is very similar to the local MPI debugging experience, although you'll continue to use the Job Manager to track the status of jobs. Note that pgdbg cannot begin debugging a job while it is queued by the Job Manager; while the job waits in the queue, pgdbg will appear to be suspended. To begin debugging, unqueue the job.

Good luck and don't forget to let us know how you're doing. Log on to the PGI User Forum or send questions to PGI Technical Support.