We are trying to run in parallel on a single node using Intel MPI 2018.0.124 and are getting the following error:
..\hydra\pm\pmiserv\pmiserv_cb.c (834): connection to proxy 0 at host XXX-NNNN failed
..\hydra\tools\demux\demux_select.c (103): callback returned error status
..\hydra\pm\pmiserv\pmiserv_pmci.c (507): error waiting for event
..\hydra\ui\mpich\mpiexec.c (1148): process manager error waiting for completion
We have checked the status of the hydra service and it appears to be working.
mpiexec itself also seems to work:
mpiexec -n 2 hostname - returns the localhost name
mpiexec -validate - returns success
We have also checked that the hydra service is running the version we want and that it is the only version installed on the machine.
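For anyone wanting to repeat the two mpiexec checks above in one go, here is a minimal sketch in Python. It only wraps the two commands quoted in this post and assumes mpiexec is on PATH. The names CHECKS and run_checks are made up for illustration and are not part of Intel MPI.

```python
import shutil
import subprocess

# The two commands quoted in the post; no other mpiexec options are assumed.
CHECKS = [
    ["mpiexec", "-n", "2", "hostname"],
    ["mpiexec", "-validate"],
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each command and return (command, exit code, combined output) tuples.

    `runner` is injectable so the function can be exercised without MPI installed.
    """
    results = []
    for cmd in checks:
        proc = runner(cmd, capture_output=True, text=True)
        output = (proc.stdout or "") + (proc.stderr or "")
        results.append((" ".join(cmd), proc.returncode, output))
    return results


if __name__ == "__main__":
    if shutil.which("mpiexec") is None:
        print("mpiexec not found on PATH")
    else:
        for cmd, code, output in run_checks():
            print(f"{cmd} -> exit code {code}")
            print(output.strip())
```

A non-zero exit code from either command would suggest the problem lies with the launcher setup rather than with the application being run.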
Is there anything we can do to check why the runs fail?
Thanks!