
Using MPI in parallel OpenMP regions


Hi all,

I am trying to call MPI from within OpenMP parallel regions, but I cannot get it to work properly. My program compiles fine with mpiicc (4.1.1.036) and icc (13.1.2 20130514), and I checked that it is linked against the thread-safe libraries (libmpi_mt.so shows up when I run ldd).
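At startup the test asks for full thread support; a minimal sketch of that check (not the attached mpitest.c itself) looks like this:

/* Minimal sketch of the thread-level check (not the attached mpitest.c):
 * request full thread support and print what the library actually grants. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* With a non-thread-safe MPI library the provided level may be lower
     * than MPI_THREAD_MULTIPLE, so the value is worth printing. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    printf("requested MPI_THREAD_MULTIPLE (%d), provided %d\n",
           MPI_THREAD_MULTIPLE, provided);

    MPI_Finalize();
    return 0;
}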

But when I run it (2 Ivy Bridge nodes x 2 MPI tasks x 12 OpenMP threads), I get a SIGSEGV without any backtrace:

/opt/softs/intel/impi/4.1.1.036/intel64/bin/mpirun -np 4 -ppn 2 ./mpitest.x

APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

Or with the debug level set to 5:

/opt/softs/intel/impi/4.1.1.036/intel64/bin/mpirun -np 4 -ppn 2 ./mpitest.x
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[0] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): shm and dapl data transfer modes
[2] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[3] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1
[2] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[2] MPI startup(): shm and dapl data transfer modes
[3] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[3] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): Rank    Pid      Node name   Pin cpu
[0] MPI startup(): 0       90871    beaufix522  {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 1       90872    beaufix522  {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): 2       37690    beaufix523  {0,1,2,3,4,5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35}
[0] MPI startup(): 3       37691    beaufix523  {12,13,14,15,16,17,18,19,20,21,22,23,36,37,38,39,40,41,42,43,44,45,46,47}
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_DIST=10,15,15,10
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=mlx4_0:0
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_PIN_MAPPING=2:0 0,1 12
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

Of course, everything works fine if I use a single OpenMP thread. I also tried wrapping the MPI calls in critical regions, which works, but that is not what I want.

My program is just a small test case to figure out whether I can use this pattern inside a bigger program. Within each MPI task, all OpenMP threads are used to send messages to the other tasks, and afterwards all OpenMP threads are used to receive messages from the other tasks.
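To make the pattern concrete, the structure is roughly the following (a simplified sketch rather than the attached mpitest.c; it assumes every rank runs the same number of OpenMP threads and uses the thread id as the message tag):

/* Simplified sketch of the pattern (not the attached mpitest.c):
 * every OpenMP thread of every rank first sends one message to each
 * other rank, then receives one message from each other rank.
 * The thread id is used as the tag so messages are matched thread-to-thread;
 * all ranks are assumed to run the same number of OpenMP threads. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int provided, rank, size;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not granted (provided=%d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

#pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int sendbuf = 1000 * rank + tid;
        int recvbuf, dest, src, nreq = 0;
        MPI_Request *reqs = malloc((size > 1 ? size - 1 : 1) * sizeof *reqs);

        /* Phase 1: every thread sends to every other rank (non-blocking,
         * so the later receives cannot deadlock on unbuffered sends). */
        for (dest = 0; dest < size; dest++)
            if (dest != rank)
                MPI_Isend(&sendbuf, 1, MPI_INT, dest, tid, MPI_COMM_WORLD,
                          &reqs[nreq++]);

        /* Phase 2: every thread receives one message from every other rank,
         * matched on its own thread id. */
        for (src = 0; src < size; src++)
            if (src != rank)
                MPI_Recv(&recvbuf, 1, MPI_INT, src, tid, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);

        MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    }

    MPI_Finalize();
    return 0;
}

With the placement above (2 tasks x 12 threads per node), up to 24 threads per node are inside MPI calls at the same time, which is the situation that triggers the segfault here.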

My questions are:

  • does my program conform to the MPI_THREAD_MULTIPLE thread level (which, by the way, is the level returned by MPI_Init_thread)?
  • is Intel MPI supposed to run it correctly?
  • if not, will it work someday?
  • what can I do now (extra tests, etc.)?

Best regards,

Philippe

Attachment: mpitest.c (2.22 KB)
