Hi,
I am trying to pin 6 MPI ranks on a dual-socket node (12 cores per socket, 24 cores total), with 4 OpenMP threads per rank. I set I_MPI_PIN_DOMAIN=omp:compact, but I_MPI_DEBUG reports the following pinning:
[0] MPI startup(): 0 14563 n2470 {0,1,12,13}
[0] MPI startup(): 1 14564 n2470 {2,3,14,15}
[0] MPI startup(): 2 14565 n2470 {4,5,16,17}
[0] MPI startup(): 3 14566 n2470 {6,7,18,19}
[0] MPI startup(): 4 14567 n2470 {8,9,20,21}
[0] MPI startup(): 5 14568 n2470 {10,11,22,23}
I would have expected each rank to get 4 consecutive cores instead:
[0] MPI startup(): 0 14563 n2470 {0,1,2,3}
[0] MPI startup(): 1 14564 n2470 {4,5,6,7}
[0] MPI startup(): 2 14565 n2470 {8,9,10,11}
[0] MPI startup(): 3 14566 n2470 {12,13,14,15}
[0] MPI startup(): 4 14567 n2470 {16,17,18,19}
[0] MPI startup(): 5 14568 n2470 {20,21,22,23}
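To make the difference concrete, here is a small Python sketch of the two layouts. It only reproduces the numbers shown above; the function names are hypothetical, and it is not a description of how Intel MPI actually computes its domains.

```python
def expected_domains(num_ranks, threads_per_rank):
    """Contiguous blocks: rank r gets CPUs r*t .. r*t + t - 1."""
    return [
        list(range(r * threads_per_rank, (r + 1) * threads_per_rank))
        for r in range(num_ranks)
    ]


def observed_domains(num_ranks, offset=12):
    """Pattern seen in the debug output: two consecutive CPUs,
    plus the two CPUs numbered `offset` higher (12 on this node)."""
    return [
        [2 * r, 2 * r + 1, offset + 2 * r, offset + 2 * r + 1]
        for r in range(num_ranks)
    ]


if __name__ == "__main__":
    for rank, (exp, obs) in enumerate(
        zip(expected_domains(6, 4), observed_domains(6))
    ):
        print(f"rank {rank}: expected {exp}  observed {obs}")
```

So in the observed layout every rank's domain is split into two pairs whose CPU numbers are 12 apart, rather than one block of 4 consecutive numbers.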
I am using Intel Parallel Studio Cluster Edition 2017 Update 5. Am I setting something incorrectly, or is this behavior intended to protect performance from my own settings?
Thanks! - Chris