Hello,
I have provided an ifort + Intel MPI build of a CFD solver to a customer; it ships with the Intel MPI runtime environment. The customer's cluster uses an SGE scheduler. The user is attempting to run on a single node with mpirun using the default Hydra process manager, but the job hangs without any meaningful error message. I asked the user to rerun with I_MPI_HYDRA_DEBUG=1 and I_MPI_DEBUG=6, but I don't see anything unusual or unexpected in the output. The last lines of the output are the following:
[mpiexec@compute-31] Launch arguments: /cm/shared/apps/sge/current/bin/linux-x64/qrsh -inherit -V compute-31.cm.cluster /a/fine/10.2/fine102/LINUX/_mpi/_impi5.0.3/intel64/bin/pmi_proxy --control-port compute-31.cm.cluster:46905 --debug --pmi-connect lazy-cache --pmi-aggregate -s 0 --rmk user --launcher sge --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 1687715305 --usize -2 --proxy-id 0
[mpiexec@compute-31] STDIN will be redirected to 1 fd(s): 9
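For reference, the debug run is launched from inside the SGE job script roughly as follows; the rank count and solver binary name are placeholders, and only the two I_MPI_* variables are the ones actually set to obtain the output above:

# Enable Hydra launcher tracing and Intel MPI library debug output
export I_MPI_HYDRA_DEBUG=1
export I_MPI_DEBUG=6
# Placeholder binary name and rank count
mpirun -np 16 ./cfd_solver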
I ran a similar test on another cluster, where the job runs correctly. On that cluster, the lines with the [proxy:0:0@nmc-0066] prefix appear directly after the last line reported on the customer's cluster:
[mpiexec@nmc-0066] STDIN will be redirected to 1 fd(s): 9
[proxy:0:0@nmc-0066] Start PMI_proxy 0
[proxy:0:0@nmc-0066] STDIN will be redirected to 1 fd(s): 9
From these tests I suspect the hang occurs while launching pmi_proxy: on the customer's cluster the proxy never reports that it has started. Are there any additional verbose or debugging modes that could help identify the problem on the user's cluster?
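As a possible way to narrow this down, the qrsh step shown in the launch arguments could be reproduced by hand from within an SGE session on the customer's node, to see whether qrsh -inherit itself hangs independently of pmi_proxy. This is only a sketch using the qrsh path and hostname taken from the log above, with hostname standing in for the pmi_proxy command:

# Run from inside an SGE job on compute-31; hostname stands in for pmi_proxy
/cm/shared/apps/sge/current/bin/linux-x64/qrsh -inherit -V compute-31.cm.cluster hostname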
Thank you for your help,
-David