PGI User Forum

MPICH2 installation issues

 
Tuan




Posted: Mon May 31, 2010 3:54 pm    Post subject: MPICH2 installation issues

I have just installed PGI CDK 10.5 in two places: once on a single machine and once on a cluster. They are all x86-64 machines, using ssh.

When I test on the single machine, MPICH seems to work, but MPICH2 only prints an "Alarm clock" message.

Quote:
Check MPI installation
MPICH1 first
-- 32-bit
hello - I am process 0 host cardiac.bi
hello - I am process 1 host cardiac.bi
hello - I am process 2 host cardiac.bi
hello - I am process 3 host cardiac.bi
-- 64-bit
hello - I am process 0 host cardiac.bi
hello - I am process 1 host cardiac.bi
hello - I am process 2 host cardiac.bi hello - I am process 3 host cardiac.bi

MPICH2 second
-- 32-bit
./script_mpitest: line 35: 8969 Alarm clock mpiexec -np 4 ./mpihello_mpich2
./script_mpitest: line 37: 8970 Alarm clock mpdallexit
-- 64-bit
./script_mpitest: line 48: 8986 Alarm clock mpiexec -np 4 ./mpihello_mpich2
./script_mpitest: line 50: 8987 Alarm clock mpdallexit


When I test on the cluster, MPICH works and MPICH2 also works, but MPICH2 seems to run on a single machine only. I followed the guidelines in the PGI CDK installation notes.

Quote:
-e Check MPI installation
-e MPICH1 first
-- 32-bit
hello - I am process 0 host nfat.binf.
hello - I am process 1 host dhpr
hello - I am process 2 host fkbp
hello - I am process 3 host nfkb
-- 64-bit
hello - I am process 0 host nfat.binf.
hello - I am process 1 host dhpr
hello - I am process 2 host fkbp
hello - I am process 3 host nfkb
-e MPICH2 second
-- 32-bit
An mpd is already running with console at /tmp/mpd2.console_minhtuan on nfat.binf.gmu.edu.
Start mpd with the -n option for a second mpd on same host.
hello - I am process 2 host nfat.binf.
hello - I am process 3 host nfat.binf.
hello - I am process 0 host nfat.binf.
hello - I am process 1 host nfat.binf.
-- 64-bit
hello - I am process 0 host nfat.binf.
hello - I am process 2 host nfat.binf. hello - I am process 3 host nfat.binf.

hello - I am process 1 host nfat.binf.


Could someone give me a hint on how to resolve this? If you need further information, please let me know.

Tuan
hongyon




Posted: Fri Jun 10, 2011 2:53 pm

On a multi-node cluster:

If an mpd is already running, you will need to shut it down first (mpdallexit).
It may have been started with only one node.

Then create an mpd.hosts file listing all the slave nodes.
mpd.hosts:
dhpr
fkbp
nfkb

Run the following commands in the directory where you created mpd.hosts.
%mpdboot --totalnum=4

#check
%mpdtrace
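
The steps above can be sketched as one script. This is only a sketch under assumptions: the host names are the ones from this thread, `write_mpd_hosts` and `ring_size` are hypothetical helper names, and the ring size is taken to be the hosts in mpd.hosts plus the local node (which matches 3 hosts and --totalnum=4 above). The mpd commands run only if they are found on PATH.

```shell
#!/bin/sh
# Sketch: rebuild the mpd ring so it spans all nodes, not just the local one.
# Host names come from this thread; replace them with your own.

# Write one slave host per line into mpd.hosts in the current directory.
write_mpd_hosts() {
    printf '%s\n' "$@" > mpd.hosts
}

# Assumed ring size: hosts listed in the file plus the local node.
ring_size() {
    n=$(grep -c . "$1")
    echo $((n + 1))
}

write_mpd_hosts dhpr fkbp nfkb
total=$(ring_size mpd.hosts)
echo "mpdboot --totalnum=$total"

# Only touch the ring when the MPICH2 mpd tools are available.
if command -v mpdboot >/dev/null 2>&1; then
    mpdallexit 2>/dev/null || true   # stop any ring that is already running
    mpdboot --totalnum="$total"      # reads ./mpd.hosts by default
    mpdtrace                         # check: should list every node in the ring
fi
```

If mpdtrace still shows only the local host, the ring was not extended to the other nodes and MPICH2 jobs will keep running on one machine.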


Running on just a single node:
Again, make sure no mpd is already running (mpdallexit). Then run:

%mpd
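
In either case, after restarting the ring, one way to see where the processes actually land is to count the distinct host names in the hello test output. A minimal sketch, assuming the output format shown in the first post ("hello - I am process N host NAME"); `count_hosts` is a hypothetical helper name, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
#!/bin/sh
# Count distinct host names in hello-test output read from stdin.
# Assumes messages shaped like: "hello - I am process 0 host nfat.binf."
# grep -o extracts every "host NAME" match, so two messages that were
# interleaved onto one line are both counted.
count_hosts() {
    grep -o 'host [^ ]*' | sort -u | wc -l | tr -d ' '
}

# Example with the MPICH2 64-bit output from the cluster run above.
printf '%s\n' \
    'hello - I am process 0 host nfat.binf.' \
    'hello - I am process 2 host nfat.binf. hello - I am process 3 host nfat.binf.' \
    'hello - I am process 1 host nfat.binf.' | count_hosts
```

With a live ring the same helper can be fed directly, for example `mpiexec -np 4 ./mpihello_mpich2 | count_hosts`. A result of 1 on the cluster means every process ran on the same node.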


Hongyon
All times are GMT - 7 Hours