
Problem with IntelIB-Basic, Intel MPI and I_MPI_FABRICS=tmi


Hi,

We have a small cluster (a head node plus 4 compute nodes with 16 cores each) using Intel InfiniBand. The cluster runs CentOS 6.6 (with the CentOS 6.5 kernel).

Intel Parallel Studio XE 2015 is installed on this cluster, and I_MPI_FABRICS is set by default to "tmi" only.
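To confirm which fabric Intel MPI actually selects at startup, I can raise the debug level; I_MPI_DEBUG is a documented Intel MPI variable, and from level 2 on the startup output reports the chosen data transfer mode:

# make Intel MPI report the selected data transfer mode at startup
export I_MPI_DEBUG=2
mpirun -np 4 IMB-MPI1 pingpong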

When I start a job (via Torque + Maui) across several nodes, for example this one:

#!/bin/bash
#PBS -N IMB-MPI1_intelmpi
#PBS -l walltime=2:00:00
#PBS -l nodes=3:ppn=4
cd $PBS_O_WORKDIR

# select the TMI fabric explicitly (also the cluster-wide default)
export I_MPI_FABRICS=tmi

mpirun IMB-MPI1

The job runs fine without any problem.

But when I start a job on a single node:

#!/bin/bash
#PBS -N IMB-MPI1_intelmpi
#PBS -l walltime=2:00:00
#PBS -l nodes=1:ppn=16
cd $PBS_O_WORKDIR

export I_MPI_FABRICS=tmi

mpirun IMB-MPI1

The job does not start, and I get this message:

can't open /dev/ipath, network down
tmi fabric is not available and fallback fabric is not enabled

Is this normal?
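In case it helps, these are the checks I can run on the compute node itself; they are standard Linux/InfiniBand commands, and the ib_qib module name is my assumption for the True Scale driver that provides /dev/ipath:

ls -l /dev/ipath*                  # device nodes the TMI/PSM layer opens
lsmod | grep -e ib_qib -e ipath    # HCA kernel driver (module name assumed)
ibstat                             # port and link state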

If I instead set I_MPI_FABRICS=dapl as the default, I don't have this problem at all.
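As a possible workaround, my reading of the Intel MPI 5.0 reference is that I_MPI_FABRICS also accepts an <intra-node>:<inter-node> pair, and that I_MPI_FALLBACK re-enables the fallback mentioned in the error message. A sketch of what I could put in the job script instead (untested on my side):

# shared memory inside a node, TMI between nodes
export I_MPI_FABRICS=shm:tmi

# ...or keep tmi but allow a fallback fabric
#export I_MPI_FABRICS=tmi
#export I_MPI_FALLBACK=1

mpirun IMB-MPI1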

How can I solve this properly?

Best regards,

Guillaume

