Channel: Clusters and HPC Technology

Disconnecting VPN causes parallel computing with Intel MPI to stop


Hi:

Our software product is based on Intel MPI for parallel computing on Windows.

Recently, many of our customers have encountered this error. Because of COVID-19, they all work from home over a VPN connection to the office.

They run our software for parallel computing at home, but when they disconnect the VPN, the parallel job stops at the same moment.

 

I can reproduce the error by the following steps:

1. Run IMB-MPI1.exe with the command: mpiexec.exe -localonly -n 4 C:\test\IMB-MPI1.exe

2. While IMB-MPI1.exe is still running, disable any one of the network interfaces (I have 3 NICs: 2 created by VMware and 1 physical). The run then aborts with the following errors:

       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
            0         1000         0.05         0.06         0.06
            1         1000         0.77         0.79         0.78
            2         1000         0.86         0.87         0.86
            4         1000         0.86         0.90         0.88
            8         1000         0.78         0.80         0.79
           16         1000         1.01         1.15         1.08
           32         1000         1.18         1.22         1.20
           64         1000         0.87         0.90         0.88
          128         1000         1.31         1.36         1.34
          256         1000         1.29         1.35         1.31
          512         1000         1.52         1.57         1.55
         1024         1000         1.45         1.47         1.46
         2048         1000         2.57         2.77         2.67
         4096         1000         3.60         4.05         3.88
         8192         1000         4.99         5.31         5.13
        16384         1000         8.44         8.74         8.52
        32768         1000        14.06        14.34        14.14
        65536          640        35.06        35.86        35.47
       131072          320        59.00        67.31        63.18
       262144          160       156.66       167.80       161.57
       524288           80       869.78       896.27       880.66
      1048576           40      2402.85      2564.92      2484.02
      2097152           20      4692.22      4907.47      4789.06
[mpiexec@PCAcer144006] ..\hydra\pm\pmiserv\pmiserv_cb.c (863): connection to proxy 0 at host PCAcer144006 failed
[mpiexec@PCAcer144006] ..\hydra\tools\demux\demux_select.c (103): callback returned error status
[mpiexec@PCAcer144006] ..\hydra\pm\pmiserv\pmiserv_pmci.c (520): error waiting for event
[mpiexec@PCAcer144006] ..\hydra\ui\mpich\mpiexec.c (1157): process manager error waiting for completion

C:\Program Files\Intel MPI 2018\x64>
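
To make the reproduction repeatable, the failure signature above can be detected automatically. The following is only a minimal sketch in Python, not part of Intel MPI: the helper names `find_hydra_errors` and `run_benchmark` are hypothetical, and the regular expression assumes the error lines always have the `[mpiexec@HOST] file (line): message` shape seen in the log above.

```python
import re
import subprocess

# Assumed shape of a Hydra error line, taken from the log above:
#   [mpiexec@HOST] ..\path\file.c (LINE): message
HYDRA_ERROR = re.compile(
    r"^\[mpiexec@(?P<host>[^\]]+)\]\s+(?P<src>\S+)\s+\((?P<line>\d+)\):\s+(?P<msg>.+)$"
)


def find_hydra_errors(output):
    """Return (host, source_file, line_number, message) for each Hydra error line."""
    errors = []
    for raw in output.splitlines():
        match = HYDRA_ERROR.match(raw.strip())
        if match:
            errors.append(
                (match["host"], match["src"], int(match["line"]), match["msg"])
            )
    return errors


def run_benchmark(cmd):
    """Run the given command line (hypothetical helper).

    Returns (exit_code, hydra_errors), scanning both stdout and stderr.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, find_hydra_errors(proc.stdout + "\n" + proc.stderr)
```

For example, calling `run_benchmark(["mpiexec.exe", "-localonly", "-n", "4", r"C:\test\IMB-MPI1.exe"])` and disabling a network interface while it runs should return a non-empty error list if the problem reproduces. The benchmark table rows are ignored because they do not start with `[mpiexec@`.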

 

Is there any workaround? Thank you.

 

