Invalid communicator, error stack

Post by shervin.s » Mon Jan 13, 2020 7:54 am

Hi,

I am using PGI 19.4 to build and run a C++ code. The MPI is version 3.1.3, the one that PGI installs itself. The code compiles fine, but at run time it dumps the following error. The same code ran successfully with 18.10.

Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(122): MPI_Comm_rank(comm=0x883390, rank=0x7ffc59754acc) failed
PMPI_Comm_rank(75).: Invalid communicator
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=671705093
:
system msg for write_line failure : Bad file descriptor
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(122): MPI_Comm_rank(comm=0x883390, rank=0x7ffda962d35c) failed
PMPI_Comm_rank(75).: Invalid communicator
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=470378501
:
system msg for write_line failure : Bad file descriptor
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(122): MPI_Comm_rank(comm=0x883390, rank=0x7ffcb2711c6c) failed
PMPI_Comm_rank(75).: Invalid communicator
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=940140549
:
system msg for write_line failure : Bad file descriptor
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(122): MPI_Comm_rank(comm=0x883390, rank=0x7ffd751fb35c) failed
PMPI_Comm_rank(75).: Invalid communicator
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=403269637
:
system msg for write_line failure : Bad file descriptor
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[19919,1],0]
Exit code: 5

Post by mkcolg » Mon Jan 13, 2020 8:49 am

Hi shervin.s,

While I can't be sure this is what's happening here, this error often means that there's a mismatch between the MPI version you're building with and the version of the runtime.

Can you please check that your LD_LIBRARY_PATH is set to use the libraries from the same OpenMPI that you used to build?

-Mat
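
One way to run the LD_LIBRARY_PATH check is sketched below. It is only an illustration, not a PGI tool: the function name `mpi_lib_dirs` and the `libmpi*.so*` file pattern are assumptions made for this example. It lists, in search order, every directory on the path that holds an MPI shared library, so a second install hiding further down the path becomes visible.

```python
import glob
import os


def mpi_lib_dirs(ld_library_path):
    """Return the directories on a search path that contain an MPI shared library.

    Order is preserved, because the dynamic loader takes the first match it finds.
    """
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        # Assumption: MPI libraries are named libmpi*.so* (e.g. libmpi.so.N).
        pattern = os.path.join(directory, "libmpi*.so*")
        if glob.glob(pattern) and directory not in hits:
            hits.append(directory)
    return hits


if __name__ == "__main__":
    dirs = mpi_lib_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
    for d in dirs:
        print(d)
    if len(dirs) > 1:
        print("warning: more than one MPI install is on LD_LIBRARY_PATH")
```

If this prints more than one directory, the first one listed is the install the runtime would normally pick up, and it should be the one the code was built against.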

Post by shervin.s » Mon Jan 13, 2020 12:46 pm

Hi Mat,

Thank you for the prompt response. Actually, I checked the libraries and binaries for mismatch before posting this. Unfortunately, that is not the problem. :(

Post by mkcolg » Mon Jan 13, 2020 1:37 pm

Other possibilities are that you're using a different mpirun/mpiexec than the one we ship, or including the mpi.h header file from a different install.

Again, the only times I've seen this error before, and from what I can tell searching the web, it typically occurs when there's a mismatch someplace. So I'd double check that the driver (mpicxx), the include directories (if you have a -I<dir> flag), mpirun, and the LD_LIBRARY_PATH all point to the same install.

If that all checks out, did the full application get rebuilt with 19.4, including libraries? Maybe there's an old object file that was built with 18.10?

-Mat
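
The "same install" check for the driver and the launcher can be sketched roughly as follows. This is a minimal sketch under assumptions: the helper names `install_prefix` and `same_install` are made up for this example, and it assumes each tool lives in a `bin/` directory directly under its install prefix. It does not look at -I flags or LD_LIBRARY_PATH.

```python
import os
import shutil


def install_prefix(tool, path=None):
    """Resolve a tool on PATH and return its install prefix (the parent of bin/)."""
    location = shutil.which(tool, path=path)
    if location is None:
        return None
    real = os.path.realpath(location)  # follow symlinks to the real file
    return os.path.dirname(os.path.dirname(real))


def same_install(tools=("mpicxx", "mpirun"), path=None):
    """Return (ok, prefixes); ok is True only if every tool was found under one prefix."""
    prefixes = {tool: install_prefix(tool, path) for tool in tools}
    found = set(prefixes.values())
    ok = None not in found and len(found) == 1
    return ok, prefixes


if __name__ == "__main__":
    ok, prefixes = same_install()
    for tool, prefix in prefixes.items():
        print(f"{tool}: {prefix}")
    print("consistent" if ok else "MISMATCH: tools come from different installs")
```

Run in the same shell environment used to build and launch the job, it reports whether `mpicxx` and `mpirun` resolve to one prefix. A mismatch here would be consistent with the kind of version mix-up described above, though a match does not rule out a stale header or object file.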

Post by shervin.s » Wed Jan 15, 2020 7:31 am

Hi Mat,

I just wanted to let you know I fixed the bug. It was a mismatch between the impi library from PGI and, apparently, a sub-library from HDF5. Anyway, it is fixed now.

Thanks for your help,
Shervin
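
For anyone hitting the same thing: a mismatch that comes in through a dependency such as HDF5 can sometimes be spotted from `ldd` output on the executable. The sketch below is a heuristic only. The function names are made up for this example, and treating any library whose name contains "mpi" as MPI-related is an assumption that may produce false positives.

```python
import os
import re
import subprocess

# Matches ldd lines of the form "name => /resolved/path (0x...)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def mpi_libs_by_dir(ldd_output):
    """Group MPI-related libraries in ldd output by the directory they resolve to."""
    groups = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        # Heuristic (assumption): a library is MPI-related if its name contains "mpi".
        if match and "mpi" in match.group(1).lower():
            directory = os.path.dirname(match.group(2))
            groups.setdefault(directory, []).append(match.group(1))
    return groups


def ldd(binary):
    """Run ldd on a binary and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `mpi_libs_by_dir(ldd("./my_app"))`, where `./my_app` is a placeholder for the real executable, returns one entry per directory. More than one directory in the result suggests that the application and one of its libraries were linked against different MPI installs.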
