Channel: Clusters and HPC Technology

-perhost not working with IntelMPI v.5.0.1.035


The -perhost option does not work as expected with IntelMPI v.5.0.1.035, though it does work with IntelMPI v.4.1.0.024:

 

$ qsub -I -lnodes=2:ppn=16:compute,walltime=0:15:00
qsub: waiting for job 5731.hpc-class.its.iastate.edu to start
qsub: job 5731.hpc-class.its.iastate.edu ready

$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-40.its.iastate.edu

$ export I_MPI_ROOT=/shared/intel//impi/4.1.0.024
$ PATH="${I_MPI_ROOT}/intel64/bin:${PATH}"; export PATH
$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-39.its.iastate.edu
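To make the check repeatable, the hostname output can be piped through a small helper. This is only a sketch: `check_perhost` is a hypothetical name, not part of Intel MPI, and it assumes each rank prints exactly one hostname line, as `uname -n` does above.

```shell
# check_perhost LIMIT: read one hostname per line on stdin and fail
# if any host appears more than LIMIT times.
check_perhost() {
    sort | uniq -c | awk -v limit="$1" '
        BEGIN { bad = 0 }
        $1 > limit { bad = 1; print "over limit: " $2 " (" $1 " ranks)" }
        END { exit bad }'
}

# Intended use inside the job (not run here):
#   mpirun -n 2 -perhost 1 uname -n | check_perhost 1 || echo "-perhost was ignored"
```

The input is sorted first because `uniq -c` only merges adjacent identical lines, and output from different ranks may arrive interleaved.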

 

I also ran the same commands with I_MPI_HYDRA_DEBUG set to 1 (see attached files mpirun-perhost.txt and mpirun-perhost-4.1.0.024.txt). Note that the first two lines of the output in mpirun-perhost.txt suggest that -perhost works (two different hostnames are printed), but the final output still shows the same hostname twice.

 

In mpirun-perhost.txt, I_MPI_PERHOST is reported as allcores. In another run (see attached file mpirun-perhost-PERHOST1.txt) I set I_MPI_PERHOST to 1, but the same hostname is still printed twice at the end.

 

To confirm that both nodes are available, I ran the command with 17 processes (each node has 16 cores):

[grl@hpc-class-39 ~]$ mpirun -n 17 uname -n | uniq -c
     16 hpc-class-39.its.iastate.edu
      1 hpc-class-38.its.iastate.edu

 

Comparing mpirun-perhost.txt and mpirun-perhost-4.1.0.024.txt shows the following difference:

mpirun-perhost.txt :

    Proxy information:
    *********************
      [1] proxy: hpc-class-40.its.iastate.edu (16 cores)
      Exec list: uname (2 processes);

mpirun-perhost-4.1.0.024.txt :

    Proxy information:
    *********************
      [1] proxy: hpc-class-39.its.iastate.edu (1 cores)
      Exec list: uname (1 processes);

      [2] proxy: hpc-class-38.its.iastate.edu (1 cores)
      Exec list: uname (1 processes);
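For comparing longer debug logs, the proxy blocks can be reduced to one "host count" line each. This is a rough sketch that assumes the I_MPI_HYDRA_DEBUG output keeps exactly the layout quoted above (a `proxy:` line followed by an `Exec list:` line with a single executable); `proxy_summary` is a hypothetical helper name.

```shell
# proxy_summary: read Hydra debug output on stdin and print
# "<proxy host> <process count>" for each proxy block.
proxy_summary() {
    awk '
        /proxy:/     { host = $3 }              # e.g. "[1] proxy: <host> (16 cores)"
        /Exec list:/ { n = $4                   # e.g. "(2" from "uname (2 processes);"
                       gsub(/[^0-9]/, "", n)
                       print host, n }'
}

# Intended use (not run here):
#   proxy_summary < mpirun-perhost.txt
#   proxy_summary < mpirun-perhost-4.1.0.024.txt
```

With the two excerpts above, the 5.0.1.035 log would reduce to a single line with a count of 2, and the 4.1.0.024 log to two lines with a count of 1 each.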

So the exec list in the IntelMPI v.5.0.1.035 run somehow does not take the -perhost value into account.
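One possible way to sidestep -perhost, which I have not verified on v.5.0.1.035, is to pass an explicit machine file with one slot per node. The sketch below assumes the scheduler provides $PBS_NODEFILE with one line per allocated core and that mpirun accepts `host:count` entries via -machinefile; `make_machinefile` and `hosts.txt` are hypothetical names.

```shell
# make_machinefile NODEFILE RANKS_PER_HOST: print each unique host from
# NODEFILE once, in "host:count" form.
make_machinefile() {
    sort -u "$1" | sed "s/\$/:$2/"
}

# Intended use inside the job (not run here):
#   make_machinefile "$PBS_NODEFILE" 1 > hosts.txt
#   mpirun -machinefile hosts.txt -n 2 uname -n
```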

 

Can anyone reproduce the problem?

