Pinning of processes spawned with MPI_Comm

Hi,

the I_MPI_PIN_* variables can be used to set to pretty much any cpu-mask for the MPI-ranks that are used. Unfortunately, the Intel MPI library doesn't set the mask correctly for processes that are dynamically spawned.

Here's an example to show the problem:

program mpispawn
  use mpi
  implicit none

  integer ierr,errcodes(1),intercomm,pcomm,mpisize,dumm,rank
  character(1000) cmd
  logical master

  call MPI_Init(ierr)
  call get_command_argument(0,cmd)
  print*,'cmd=',trim(cmd)
  call MPI_Comm_get_parent(pcomm,ierr)
  if (pcomm.eq.MPI_COMM_NULL) then
    print*,'I am the master. Clone myself!'
    master=.true.
    call MPI_Comm_spawn(cmd,MPI_ARGV_NULL,4,MPI_INFO_NULL,0,MPI_COMM_WORLD,pcomm,errcodes,ierr)
    call MPI_Comm_size(pcomm,mpisize,ierr)
    print*,'Processes in intercommunicator:',mpisize
    dumm=88
    call MPI_Bcast(dumm,1,MPI_INTEGER,MPI_ROOT,pcomm,ierr)
  else
    print*,'I am a clone. Use me'
    master=.false.
    call MPI_Bcast(dumm,1,MPI_INTEGER,0,pcomm,ierr)
  endif
  call MPI_Comm_rank(pcomm,rank,ierr)
  print*,'rank,master,dumm=',rank,master,dumm
  call sleep(300)
  call MPI_Barrier(pcomm,ierr)
  call MPI_Finalize(ierr)
end

I run this example on 2 nodes, each with 2 8-core CPUs. I request core binding and domains with scattered ordering and 3 processes per node (so ranks are bound round-robin to the sockets). mpirun starts 2 MPI-processes, and these spawn 4 further MPI-processes:

[donners@int1 mpispawn]$ I_MPI_DEBUG=4 mpirun -n 2 -hosts "int1,int2" -ppn 3 -binding "pin=yes;cell=core;domain=1;order=scatter" ./mpi.impi
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[1] MPI startup(): shm data transfer mode
[0] MPI startup(): Rank    Pid      Node name                   Pin cpu
[0] MPI startup(): 0       31036    int1.cartesius.surfsara.nl  {0}
[0] MPI startup(): 1       31037    int1.cartesius.surfsara.nl  {8}
 cmd=./mpi.impi
 I am the master. Clone myself!
 cmd=./mpi.impi
 I am the master. Clone myself!
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[1] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[0] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): reinitialization: shm and dapl data transfer modes
[1] MPI startup(): reinitialization: shm and dapl data transfer modes
[0] MPI startup(): Multi-threaded optimized library
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[3] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[2] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[1] MPI startup(): shm and dapl data transfer modes
[3] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[3] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[0] MPI startup(): shm and dapl data transfer modes
[2] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[2] MPI startup(): shm and dapl data transfer modes
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[1] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[1] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[2] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[2] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[3] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[3] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
 Processes in intercommunicator:           2
 rank,master,dumm=           1 T          88
 Processes in intercommunicator:           2
 rank,master,dumm=           0 T          88
[0] MPI startup(): Rank    Pid      Node name                   Pin cpu
[0] MPI startup(): 0       31045    int1.cartesius.surfsara.nl  {0}
[0] MPI startup(): 1       24519    int2.cartesius.surfsara.nl  {0}
[0] MPI startup(): 2       24520    int2.cartesius.surfsara.nl  {8}
[0] MPI startup(): 3       24521    int2.cartesius.surfsara.nl  {1}
 cmd=./mpi.impi
 I am a clone. Use me
 rank,master,dumm=           0 F          88
 cmd=./mpi.impi
 I am a clone. Use me
 cmd=./mpi.impi
 I am a clone. Use me
 cmd=./mpi.impi
 I am a clone. Use me
 rank,master,dumm=           1 F          88
 rank,master,dumm=           2 F          88
 rank,master,dumm=           3 F          88

The 2 initial processes are bound correctly, each round-robin to the first core of each sockets on the first node. However, the first dynamically spawned rank is also bound to the first core, but this should have been the second core. Note that the processes that were dynamically spawned, do get distributed correctly across nodes. Also the binding on the second node is correct.

What can be done to bind all dynamically spawned processes correctly?

Thread Topic:

Bug Report

Pinning of processes spawned with MPI_Comm_spawn

Thread Topic:

Trending Articles

Bath man appears in court charged with attempted murder of a man...

MACLEAN, Allan

Black Angus Grilled Artichokes

Practice Sheet of Right form of verbs for HSC Students

Police blotter for Jan. 12

99 God Status for Whatsapp, Facebook

Rajasthan Board 12th Science Result 2018 name wise- RBSE 12th commerce result...

Notorious Naushad of Ippa gang nabbed

Child Kidnapping: Amy McNeil was kidnapped on her way to school by 5 adults;...

Sonible Smartlimit v1.1.5-R2R

NCERT Solutions for Class 9th Sanskrit Chapter 3 पाथेयम्

मतलबी दोस्त स्टेट्स | Matlabi Dost Status in Hindi – Selfish Friends Status

Arrow Flash 2 – Sinhala Dubbed – Episode 23 – 20th March 2016

[GET] AI Traffic Goldmine

[E² Plugin] HDF-Radio

Universal Multi-Patch v1.3 By RADIXX11

IWAN – Thanks and Praise ( Throw Back Thursday )

RONALD P SONDERGAARD Arrested by Miami-Dade County Corrections on Mar 03, 2017

मुख मैथुन से उठाएं सेक्स का भरपूर मज़ा, जानें क्या है इसका सही तरीकामुख मैथुन...

HSSC Excise & Taxation Inspector Result 2017 Scorecard/ Category Wise Merit List