
MPI performance problem on inter-switch connection

Hi,

I have a cluster with 32 machines. The first 25 machines are in the first rack and the remaining 7 machines are in the second rack.
Each rack has a 1Gbps Ethernet switch.

I run an MPI application on all 32 machines (one process per machine).
When I use a network benchmark tool such as 'iperf' to measure the speed between machines, there is no problem: every point-to-point connection among the 32 machines reaches the full 1 Gbps bandwidth.

In my application (MPI_Send/MPI_Recv), each MPI process sends a few 4 MB messages to the other machines, so message size should not be the issue.
I found that the communication speed between the first 25 machines and the other 7 machines is very poor (about 10-20 MB/s),
while the speed within each group (among the first 25 machines, or among the other 7) is fast: 100-110 MB/s.
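For reference, here is a simplified sketch of the kind of point-to-point measurement I mean (this is not my actual application code; the rank-0-to-peer pairing, the 8 repetitions, and the 1-byte ack are just for illustration):

/* Sketch: rank 0 exchanges a few 4 MB messages with every other rank
 * and reports the observed bandwidth per peer. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE (4 * 1024 * 1024)   /* 4 MB, as in the application */
#define REPS     8                   /* "a few" messages per peer   */

int main(int argc, char **argv)
{
    int rank, size;
    char ack = 0;
    char *buf = malloc(MSG_SIZE);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (int peer = 1; peer < size; peer++) {
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();

        if (rank == 0) {
            for (int i = 0; i < REPS; i++)
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            /* wait for a 1-byte ack so the timing covers full delivery */
            MPI_Recv(&ack, 1, MPI_CHAR, peer, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            double mb = (double)REPS * MSG_SIZE / (1024.0 * 1024.0);
            printf("rank 0 -> rank %d: %.1f MB/s\n",
                   peer, mb / (MPI_Wtime() - t0));
        } else if (rank == peer) {
            for (int i = 0; i < REPS; i++)
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            MPI_Send(&ack, 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

With a test like this, the slow pairs are exactly the ones that cross the rack boundary.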


What is the likely cause here? Is latency killing it?
What can I do to improve the performance?

Are there any suggested optimizations?

