I am finding that MPI_Abort does not kill processes on remote nodes when I_MPI_FABRICS=shm:tmi is set. The attached PBS job aborts cleanly with the default fabric (shm:dapl), but does not abort cleanly with shm:tmi. Any help with overcoming this problem would be much appreciated.
MPI_Abort fails with TMI fabric