Hi,
I am having trouble running the test program linux/mpi/test/test.c that is included in the Intel MPI package.
The problem only occurs on machines equipped with AMD processors.
Specifically, I installed MPI in my home directory (which is NFS-mounted on both the host1 and host2 machines) and then compiled test.c using mpicc:
mpicc -show -o test test.c
gcc -o 'test' 'test.c' -I/home/jyli/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/include -L/home/jyli/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib/release_mt -L/home/jyli/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /home/jyli/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib/release_mt -Xlinker -rpath -Xlinker /home/jyli/intel/compilers_and_libraries_2017.2.174/linux/mpi/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2107.0.0/intel64/lib/release_mt -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/2017.0.0/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread
I checked that the two machines can reach each other through mpirun:
mpirun -ppn 1 -n 2 -hosts host1,host2 hostname
host1
host2
However, when I run the test program, I get the following errors:
mpirun -ppn 1 -n 2 -hosts host1,host2 ./test
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2 Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation. All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 5582 RUNNING AT host2
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 5582 RUNNING AT host2
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Intel(R) MPI Library troubleshooting guide:
https://software.intel.com/node/561764
===================================================================================
I then loaded the generated core file in gdb, and here is the backtrace:
gdb ./test core
(gdb) bt
#0 __GI_____strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=<optimized out>, loc=0x7faf932e2060 <_nl_global_locale>) at ../stdlib/strtol_l.c:298
#1 0x00007faf9368e11a in atoi (__nptr=<optimized out>) at /usr/include/stdlib.h:286
#2 i_mpi_numa_nodes_compare (a=0x0, b=0x0) at ../../src/mpid/ch3/src/mpid_init.c:62
#3 0x00007faf92f5b419 in msort_with_tmp (p=0x7fff5c532aa0, b=0xac7b30, n=2) at msort.c:83
#4 0x00007faf92f5b6cc in msort_with_tmp (n=2, b=0xac7b30, p=0x7fff5c532aa0) at msort.c:45
#5 __GI_qsort_r (b=0xac7b30, n=2, s=8, cmp=0x7faf9368e100 <i_mpi_numa_nodes_compare>, arg=<optimized out>) at msort.c:297
#6 0x00007faf936911af in MPID_nem_impi_create_numa_nodes_map () at ../../src/mpid/ch3/src/mpid_init.c:1305
#7 0x00007faf93692284 in MPID_Init (argc=0x0, argv=0x0, requested=10, provided=0x0, has_args=0x7faf932e2060 <_nl_global_locale>, has_env=0xac7db1) at ../../src/mpid/ch3/src/mpid_init.c:1732
#8 0x00007faf9362872b in MPIR_Init_thread (argc=0x0, argv=0x0, required=10, provided=0x0) at ../../src/mpi/init/initthread.c:717
#9 0x00007faf93615e2b in PMPI_Init (argc=0x0, argv=0x0) at ../../src/mpi/init/init.c:253
#10 0x0000000000400a6e in main ()
(gdb)
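My reading of frames #0 to #2, which may well be wrong: atoi() is being called on a NULL string from inside the comparator that qsort uses while MPI_Init builds its NUMA node map, and exit code 11 matches the number of SIGSEGV on Linux. To illustrate the pattern I suspect, here is a small C sketch. It is not Intel's code. The names safe_node_compare and sort_node_ids are hypothetical, and I am only assuming that the sorted elements are pointers to strings (the element size s=8 in frame #5 is at least consistent with that).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical NULL-tolerant comparator, NOT the library's implementation.
 * Assumption: each array element is a pointer to a string holding a node id. */
int safe_node_compare(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;

    /* atoi(NULL) is undefined behaviour; in glibc it dereferences the
     * pointer inside strtol, which is where frame #0 shows nptr=0x0.
     * Here a missing id is mapped to -1 instead. */
    int ia = sa ? atoi(sa) : -1;
    int ib = sb ? atoi(sb) : -1;

    return (ia > ib) - (ia < ib);
}

/* Hypothetical helper: sort an array of node-id strings in place. */
void sort_node_ids(const char **ids, size_t n)
{
    assert(ids != NULL || n == 0);
    if (n > 1)
        qsort(ids, n, sizeof ids[0], safe_node_compare);
}
```

With this guard, an array such as { "1", NULL, "0" } sorts with the NULL entry first instead of crashing. If the library's comparator has no such guard, a NUMA id that cannot be read on these AMD machines might explain the segfault, but that is only a guess.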
Does anyone have any idea what is going on? Thank you very much in advance!
Jenny