Hi,
Our software product is based on Intel MPI for parallel computing on Windows.
Recently, many of our customers have run into the following problem. Due to COVID-19, they all work from home over a VPN connection to the office.
They run our software's parallel computations at home, but as soon as they disconnect from the VPN, the parallel job stops.
I can reproduce the error with the following steps:
1. Run IMB-MPI1.exe with the command: mpiexec.exe -localonly -n 4 C:\test\IMB-MPI1.exe
2. While IMB-MPI1.exe is still running, disable any one of the network interfaces (I have 3 NICs: 2 created by VMware and 1 physical). mpiexec then fails with the following errors:
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.05         0.06         0.06
            1         1000         0.77         0.79         0.78
            2         1000         0.86         0.87         0.86
            4         1000         0.86         0.90         0.88
            8         1000         0.78         0.80         0.79
           16         1000         1.01         1.15         1.08
           32         1000         1.18         1.22         1.20
           64         1000         0.87         0.90         0.88
          128         1000         1.31         1.36         1.34
          256         1000         1.29         1.35         1.31
          512         1000         1.52         1.57         1.55
         1024         1000         1.45         1.47         1.46
         2048         1000         2.57         2.77         2.67
         4096         1000         3.60         4.05         3.88
         8192         1000         4.99         5.31         5.13
        16384         1000         8.44         8.74         8.52
        32768         1000        14.06        14.34        14.14
        65536          640        35.06        35.86        35.47
       131072          320        59.00        67.31        63.18
       262144          160       156.66       167.80       161.57
       524288           80       869.78       896.27       880.66
      1048576           40      2402.85      2564.92      2484.02
      2097152           20      4692.22      4907.47      4789.06
[mpiexec@PCAcer144006] ..\hydra\pm\pmiserv\pmiserv_cb.c (863): connection to proxy 0 at host PCAcer144006 failed
[mpiexec@PCAcer144006] ..\hydra\tools\demux\demux_select.c (103): callback returned error status
[mpiexec@PCAcer144006] ..\hydra\pm\pmiserv\pmiserv_pmci.c (520): error waiting for event
[mpiexec@PCAcer144006] ..\hydra\ui\mpich\mpiexec.c (1157): process manager error waiting for completion
C:\Program Files\Intel MPI 2018\x64>
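In case it helps anyone reproduce this without clicking through the adapter settings, the two steps above can be scripted roughly as follows. This is only a sketch under some assumptions: the mpiexec command is the one from step 1, the interface name passed in (for example "Ethernet 2") is a placeholder for whatever your adapter is called, the 10-second delay is arbitrary, and the script has to be run from an elevated prompt because netsh needs administrator rights to change an interface.

```python
import subprocess
import time

# Command from step 1 of this post; adjust the path to your IMB-MPI1.exe.
MPI_CMD = ["mpiexec.exe", "-localonly", "-n", "4", r"C:\test\IMB-MPI1.exe"]


def nic_command(name, enable):
    """Build the netsh command that enables or disables one interface."""
    state = "enabled" if enable else "disabled"
    return ["netsh", "interface", "set", "interface", name, f"admin={state}"]


def reproduce(nic_name, delay_s=10.0):
    """Start the benchmark, disable one NIC while it runs, and return
    the exit code of mpiexec.

    nic_name is a placeholder for the adapter name on your machine;
    delay_s is an arbitrary wait so the benchmark is running first.
    """
    job = subprocess.Popen(MPI_CMD)
    time.sleep(delay_s)
    try:
        subprocess.run(nic_command(nic_name, enable=False), check=True)
        return job.wait()
    finally:
        # Re-enable the interface even if something above failed.
        subprocess.run(nic_command(nic_name, enable=True), check=False)
```

Calling reproduce("Ethernet 2") with your own adapter name should give a non-zero exit code when the failure shown above occurs.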
Is there any workaround? Thank you.