
Question about hybrid MPI/OpenMP


Dear all,

I run the program with the following command:

mpiexec -wdir z:\directional -mapall -hosts 10 n01 5 n02 5 n03 5 n04 5 n05 5 n06 5 n07 5 n08 5 n09 5 n10 5 test

The cluster has 10 nodes, each with 24 logical cores (2 × Intel(R) Xeon(R) CPU X5675). The program 'test' has OpenMP-based parallel computation in some parts, but a considerable part of it is not parallelized.

The problem is that each 'test' process uses only 4 cores in the parallel parts (total CPU usage is only 80%). I noticed that when I set I_MPI_PIN_DOMAIN=omp, every 'test' process uses all 24 cores. I have also tested 'test' on a single node with:

mpiexec -wdir z:\directional -mapall -n 5 test

In that case 'test' runs the way I want (total CPU usage is 100% in the parallel parts).
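For context, the structure of 'test' is roughly like the following (a simplified sketch in C, not the actual code):

/* Simplified sketch of the structure of 'test' (not the real code). */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* MPI_THREAD_FUNNELED: only the main thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* ... serial part that is not parallelized ... */

    /* OpenMP-parallel part: the threads spread over whatever cores
       this rank is allowed to run on (its pinning domain). */
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 1000000; i++)
        sum += 1.0 / i;

    printf("rank %d: max threads = %d, sum = %f\n",
           rank, omp_get_max_threads(), sum);

    MPI_Finalize();
    return 0;
}

So the number of cores the OpenMP part of each rank can use is determined by the pinning domain Intel MPI assigns to that rank, which I think is why I_MPI_PIN_DOMAIN changes the behaviour.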

Now the problem is that the first command (the 10-node run) fails after I set I_MPI_PIN_DOMAIN=omp:

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(658)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(104)..................:
MPID_nem_tcp_post_init(345)..........:
MPID_nem_newtcp_module_connpoll(3102):
gen_read_fail_handler(1196)..........: read from socket failed - The specified network name is no longer available.

What should I do to let the program use 100% CPU on every node?
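For reference, what I am effectively trying to run is the first command with the pinning variable passed per job instead of being set globally, something like this (assuming -genv is the appropriate way to pass it on Windows):

mpiexec -wdir z:\directional -mapall -genv I_MPI_PIN_DOMAIN omp -hosts 10 n01 5 n02 5 n03 5 n04 5 n05 5 n06 5 n07 5 n08 5 n09 5 n10 5 test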

Thanks,

Zhanghong Tang

