
pmi_proxy stalls the HPC job

Hi HPC enthusiasts,
We have an 8-node Sandy Bridge cluster with the following configuration:

Hardware:
1U rackmount enclosure
Intel S2400SC2 board
2 x Xeon E5-2450 processors
96GB ECC DDR3 RDIMM
Intel True Scale QLE7340-CK HCA
500GB Enterprise SATA
36 port QLogic switch
24-port 1GbE switch

Software:
CentOS 6.2 x64
Intel MPI Library 4.1.1.036
Intel Fortran Composer XE 2013.3.163
NetCDF 4.0
FFTW 3.3.3
Open Grid Engine 2011.11.p1
NFS share
Passphraseless SSH between all machines (full mesh)

Of late, whenever we submit our job (home-grown code), either directly via mpirun or through Grid Engine qsub, it almost always (~90% of the time) fails to start executing and simply appears to be stalled. Inspecting the processes on the nodes, we find that a few nodes, seemingly at random, show a 'pmi_proxy' process with status 'D' (uninterruptible sleep).
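
For reference, this is roughly the check we run on a suspect node (the hostname is just a placeholder for one of our compute nodes):

# List the Hydra helper processes and their scheduler state; 'D' in the STAT
# column means the process is blocked in uninterruptible sleep inside the kernel.
ssh node01 "ps -eo pid,stat,wchan:30,cmd | grep '[p]mi_proxy'"

The wchan column at least shows which kernel function the stuck process is waiting in.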

We have tested IMB (the Intel MPI Benchmarks) and the test codes that ship with Grid Engine and Intel MPI on the cluster, both via mpirun and through qsub, and they all run fine.
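
For context, the benchmark run that works looks roughly like this (the hostfile name and process count are just examples, not our exact command line):

# IMB-MPI1 is the main binary from the Intel MPI Benchmarks suite; the
# hostfile lists our 8 compute nodes. The same command also runs fine via qsub.
mpirun -f ./hostfile -n 16 ./IMB-MPI1 PingPong Sendrecv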

What is the pmi_proxy process, and how can we stop the job from stalling? The non-functioning job is driving me crazy. Please excuse me if this has already been discussed elsewhere, or if this is not the correct forum; I am a novice HPC user.
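
If more detail would help, we can rerun the job with Intel MPI's debug output enabled and post the log, for example (./our_code and the hostfile name are placeholders):

# I_MPI_DEBUG and I_MPI_HYDRA_DEBUG turn on verbose output from the library
# and the Hydra process manager, respectively.
I_MPI_DEBUG=5 I_MPI_HYDRA_DEBUG=1 mpirun -f ./hostfile -n 16 ./our_code 2>&1 | tee run.log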

Any guidance would be appreciated.

Thanks in advance for any suggestions.

With regards
Girish Nair
+91 98457 36460
girishnairisonline <at> gmail <dot> com

