Channel: Clusters and HPC Technology

Intel MPI cross-OS launch error


Environment:

node1: Windows 10 (192.168.137.1)

node2: Debian 8 virtual machine (192.168.137.3)

 

Test app: the test.cpp included with the Intel MPI package

 

1. Launch from the Windows side (node1), 1 process (node1 only):

mpiexec -demux select -bootstrap=service -genv I_MPI_FABRICS=shm:tcp -n 1 -host localhost test

Output:

node1:

 

Hello world: rank 0 of 1 running on DESKTOP-J4KRVVD

2. Launch from the Windows side (node1), 1 process (node2 only):

mpiexec -demux select -bootstrap=service -genv I_MPI_FABRICS=shm:tcp -host 192.168.137.3 -hostos linux -n 1 -path /opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/test test

Output:

node1:

Hello world: rank 0 of 1 running on vm-build-debian8

3. Launch from the Windows side (node1), 2 processes (1 on node1, 1 on node2):

mpiexec -demux select -bootstrap=service -genv I_MPI_FABRICS=shm:tcp -host 192.168.137.3 -hostos linux -n 1 -path /opt/intel/compilers_and_libraries_2017.2.174/linux/mpi/test test : -n 1 -host localhost test

Error:

node1:

rank = 1, revents = 29, state = 1
Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 2988: (it_plfd->revents & POLLERR) == 0
internal ABORT - process 0

 

node2:

[hydserv@vm-build-debian8] stdio_cb (../../tools/bootstrap/persist/persist_server.c:170): assert (!closed) failed
[hydserv@vm-build-debian8] HYDT_dmxu_poll_wait_for_event (../../tools/demux/demux_poll.c:76): callback returned error status
[hydserv@vm-build-debian8] main (../../tools/bootstrap/persist/persist_server.c:339): demux engine error waiting for event
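
For reference, the revents value in the node1 error above can be decoded into individual poll flags. This is a minimal sketch, assuming the assertion was raised by the Linux rank (the source path in the message uses forward slashes) and that the Linux <poll.h> bit values therefore apply. The names POLL_FLAGS and decode_revents are only illustrative and are not part of Intel MPI.

```python
# Linux <poll.h> bit values. ASSUMPTION: the asserting process is the
# Linux rank, so these are the values its revents field uses.
POLL_FLAGS = {
    "POLLIN": 0x001,
    "POLLPRI": 0x002,
    "POLLOUT": 0x004,
    "POLLERR": 0x008,
    "POLLHUP": 0x010,
    "POLLNVAL": 0x020,
}


def decode_revents(revents):
    """Return the names of the poll flags set in revents, lowest bit first."""
    return [name for name, bit in POLL_FLAGS.items() if revents & bit]


# Value taken from the log line "rank = 1, revents = 29, state = 1".
print(decode_revents(29))
```

Under that assumption, 29 breaks down into POLLIN, POLLOUT, POLLERR and POLLHUP: the socket to rank 1 reported both an error and a hang-up. That is consistent with the failed assertion, but it does not say why the connection dropped.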

 

If I try to turn on verbose output with -v or -genv I_MPI_HYDRA_DEBUG=on, even test 2 fails with the errors below. What is going wrong here, and how can I find out?

node1:

[mpiexec@DESKTOP-J4KRVVD] STDIN will be redirected to 1 fd(s): 4

[mpiexec@DESKTOP-J4KRVVD] ..\hydra\utils\sock\sock.c (420): write error (Unknown error)
[mpiexec@DESKTOP-J4KRVVD] ..\hydra\tools\bootstrap\persist\persist_launch.c (52): assert (sent == hdr.buflen) failed
[mpiexec@DESKTOP-J4KRVVD] ..\hydra\tools\demux\demux_select.c (103): callback returned error status
[mpiexec@DESKTOP-J4KRVVD] ..\hydra\pm\pmiserv\pmiserv_pmci.c (501): error waiting for event
[mpiexec@DESKTOP-J4KRVVD] ..\hydra\ui\mpich\mpiexec.c (1147): process manager error waiting for completion

node2:

[hydserv@vm-build-debian8] stdio_cb (../../tools/bootstrap/persist/persist_server.c:170): assert (!closed) failed
[hydserv@vm-build-debian8] HYDT_dmxu_poll_wait_for_event (../../tools/demux/demux_poll.c:76): callback returned error status
[hydserv@vm-build-debian8] main (../../tools/bootstrap/persist/persist_server.c:339): demux engine error waiting for event
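
Since the failing assertion in test 3 sits in the TCP netmod, one thing that might be worth ruling out is whether each node can open a plain TCP connection to the other, in both directions. The sketch below is only a generic reachability check, not an Intel MPI tool. The function name can_connect and the port number in the usage comment are hypothetical, and something must already be listening on the chosen port on the peer for the check to mean anything.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Hypothetical usage, run on node2 against a test listener on node1:
#   can_connect("192.168.137.1", 5000)
# and the reverse, run on node1 against a test listener on node2:
#   can_connect("192.168.137.3", 5000)
```

A False result in only one direction would suggest that connections are being blocked on that path. A True result in both directions would point away from basic connectivity and toward something else.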

 

