
MPI Performance issue on bi-directional communication


Hi,

(I have attached the performance measurement program, written in C++.)

I am experiencing a performance issue with bi-directional MPI_Send/MPI_Recv operations.

The program runs two threads: one for MPI_Send and the other for MPI_Recv (a minimal sketch of this pattern follows the list below).
- The MPI_Recv thread receives data from any source (MPI_ANY_SOURCE).
- The MPI_Send thread sends data to each rank in turn, starting from its own rank, then rank+1, ..., 0, ..., rank-1.
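
For reference, here is a minimal, hypothetical sketch of this pattern (it is not the attached mpi-test.cpp): it uses std::thread instead of OpenMP, and the 1 MiB message size and 100-iteration count are assumptions chosen for illustration. Both threads call MPI concurrently, so it needs MPI_THREAD_MULTIPLE and the thread-safe MPI library (the -mt_mpi flag in the compile command below).

// Hypothetical sketch (not the attached mpi-test.cpp): one receiver thread
// and one sender thread per rank; each rank sends a fixed payload to every
// rank in turn and reports per-destination bandwidth.
#include <mpi.h>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main(int argc, char** argv) {
    int provided = 0;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        std::fprintf(stderr, "MPI_THREAD_MULTIPLE is not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int kMsgBytes = 1 << 20;  // 1 MiB per message (assumption)
    const int kIters    = 100;      // messages per peer (assumption)

    // Receiver thread: accept the expected number of messages from any source.
    std::thread receiver([&] {
        std::vector<char> rbuf(kMsgBytes);
        for (int i = 0; i < size * kIters; ++i) {
            MPI_Recv(rbuf.data(), kMsgBytes, MPI_CHAR, MPI_ANY_SOURCE,
                     0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    });

    // Sender (main thread): send to each rank in turn, starting from our own
    // rank, and report the per-destination bandwidth.
    std::vector<char> sbuf(kMsgBytes, 'x');
    for (int off = 0; off < size; ++off) {
        int dest = (rank + off) % size;
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < kIters; ++i) {
            MPI_Send(sbuf.data(), kMsgBytes, MPI_CHAR, dest, 0, MPI_COMM_WORLD);
        }
        double sec = std::chrono::duration<double>(
                         std::chrono::steady_clock::now() - t0).count();
        double mb = double(kMsgBytes) * kIters / (1024.0 * 1024.0);
        std::printf("rank[%d] --> rank[%d]\tBW=%.2f [MB/sec]\n", rank, dest, mb / sec);
    }

    receiver.join();
    MPI_Finalize();
    return 0;
}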

You can compile the attached file as follows:
$ mpiicpc -O3 -m64 -std=c++11 -mt_mpi -qopenmp ./mpi-test.cpp -o mpi-test

You can test it as follows:
$ mpiexec.hydra -genv I_MPI_PERHOST 1 -genv I_MPI_FABRICS tcp -n 2 -machinefile ./machine_list /home/TESTER/mpi-test
rank[0] --> rank[0]     BW=2060.27 [MB/sec]
rank[0] --> rank[1]     BW=56.38 [MB/sec]
rank[0] BW=219.21 [MB/sec]
rank[1] BW=217.20 [MB/sec]

$ mpiexec.hydra -genv I_MPI_PERHOST 1 -genv I_MPI_FABRICS tcp -n 4 -machinefile ./machine_list /home/TESTER/mpi-test
rank[0] --> rank[0]     BW=2050.59 [MB/sec]
rank[0] --> rank[1]     BW=112.35 [MB/sec]
rank[0] --> rank[2]     BW=57.19 [MB/sec]
rank[0] --> rank[3]     BW=109.64 [MB/sec]
rank[0] BW=218.28 [MB/sec]
rank[1] BW=219.17 [MB/sec]
rank[2] BW=220.75 [MB/sec]
rank[3] BW=221.17 [MB/sec]
 

What I am observing is that when data transfers from rank A to rank B and from rank B to rank A occur simultaneously, the performance drops significantly (by almost half).
The cluster machines run CentOS 7 and are connected by 1 Gbps Ethernet that supports full-duplex transmission.

How can I resolve this issue?

- Does Intel MPI support full-duplex transmission mode between two ranks?

Attachment: mpi-test.cpp (text/x-c++src, 2.5 KB)
