Hello,
I am running into problems using two NICs with Intel MPI (2018.3.222). I have two Mellanox FDR InfiniBand NICs, so I try to bind each process to a different NIC with the following script:
if [ $MPI_LOCALRANKID -eq 0 ]; then
IB_NUMBER=0
else
IB_NUMBER=1
fi
# export I_MPI_FABRICS=shm:ofa I_MPI_OFA_NUM_ADAPTERS=1 I_MPI_OFA_ADAPTER_NAME=mlx4_${IB_NUMBER}
export FI_VERBS_IFACE=mlx4_${IB_NUMBER}
export I_MPI_HYDRA_IFACE=mlx4_${IB_NUMBER}
echo $IB_NUMBER
./IMB-MPI1 -include Sendrecv Allreduce
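I save those lines in a wrapper script (call it run_imb.sh, the name here is just a placeholder) and launch it through mpirun, so every rank evaluates MPI_LOCALRANKID for itself and the two local ranks on each node should pick different adapters. The launch is roughly:

mpirun -n 8 -ppn 2 ./run_imb.sh    # 8 ranks, 2 per node; run_imb.sh = the wrapper above (placeholder name)

With I_MPI_DEBUG=5 set, the run prints: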
MPI startup(): shm:ofa fabric is unknown or has been removed from the product, please use ofi or shm:ofi instead
[0] MPI startup(): libfabric version: 1.8.0a1-impi
[0] MPI startup(): libfabric provider: sockets
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       51688    i1         {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77}
[0] MPI startup(): 1       51689    i1         {26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103}
[0] MPI startup(): 2       27108    i2         {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77}
[0] MPI startup(): 3       27109    i2         {26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103}
[0] MPI startup(): 4       53516    i3         {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77}
[0] MPI startup(): 5       53517    i3         {26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103}
[0] MPI startup(): 6       44894    i4         {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77}
[0] MPI startup(): 7       44895    i4         {26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103}
[0] MPI startup(): I_MPI_ROOT=/opt/spack/spack-avx512/opt/spack/linux-debian9-x86_64/gcc-8.2.0/intel-mpi-2019.3.199-ooicvtdn7kvu2yr7dzoggbn4tizn5eri/compilers_and_libraries_2019.3.199/linux/mpi
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_HYDRA_IFACE=mlx4_0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_FABRICS=shm:ofa
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_OFA_NUM_ADAPTERS variable has been removed from the product, its value is ignored
[0] MPI startup(): I_MPI_OFA_ADAPTER_NAME variable has been removed from the product, its value is ignored
[0] MPI startup(): I_MPI_OFA_NUM_ADAPTERS environment variable is not supported.
[0] MPI startup(): I_MPI_OFA_ADAPTER_NAME environment variable is not supported.
[0] MPI startup(): Similar variables:
I_MPI_ADJUST_SCATTER_NODE
[0] MPI startup(): To check the list of supported variables, use the impi_info utility or refer to https://software.intel.com/en-us/mpi-library/documentation/get-started.
#------------------------------------------------------------
# Intel(R) MPI Benchmarks 2019 Update 3, MPI-1 part
#------------------------------------------------------------
# Date : Sun Jun 16 14:52:23 2019
# Machine : x86_64
# System : Linux
# Release : 4.9.0-7-amd64
# Version : #1 SMP Debian 4.9.110-1 (2018-07-05)
# MPI Version : 3.1
# MPI Thread Environment:
The results are the same as with a single FDR link and far below the combined 2 x 56 Gb/s InfiniBand bandwidth.
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
65536 640 14.86 14.86 14.86 8822.22
131072 320 27.31 27.32 27.31 9596.69
262144 160 57.10 57.13 57.12 9177.00
524288 80 145.31 145.60 145.45 7201.97
1048576 40 299.75 301.67 300.71 6951.85
2097152 20 577.48 586.55 582.01 7150.82
4194304 10 1373.00 1388.52 1380.76 6041.42
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
65536 640 62.04 62.13 62.08 2109.51
131072 320 95.83 95.97 95.91 2731.60
262144 160 183.33 184.00 183.74 2849.39
524288 80 356.31 359.86 358.58 2913.82
1048576 40 11500.00 11517.56 11510.18 182.08
I wonder how I should bind each process to a different NIC. And is there a better way to make full use of both NICs, for example having one process use both NICs simultaneously?
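In case it helps to be concrete, this is the direction I am guessing at (not verified): force the OFI verbs provider instead of the sockets fallback shown above, and point each local rank at the network interface of its own adapter. As far as I understand, FI_VERBS_IFACE expects a network interface name rather than a device name, so I am assuming here that the IPoIB interfaces are called ib0 and ib1:

export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=verbs              # ask libfabric for the verbs provider instead of sockets
if [ $MPI_LOCALRANKID -eq 0 ]; then
    export FI_VERBS_IFACE=ib0         # assumed IPoIB interface on mlx4_0
else
    export FI_VERBS_IFACE=ib1         # assumed IPoIB interface on mlx4_1
fi
./IMB-MPI1 -include Sendrecv Allreduce

Is something like this the intended way in Intel MPI 2019, or is there a supported option that lets one rank stripe traffic across both adapters?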