Channel: Clusters and HPC Technology

Intel MPI, DAPL and libdaplomcm

Recently, we upgraded our system and installed Mellanox OFED 2.2-1 in order to support native MPI calls between Xeon Phis.  Our system is a mixture of non-Phi nodes and Phi nodes.

In the course of the upgrade, something seems to have changed in how Intel MPI (v4.1.3) determines which DAPL provider to use for jobs that do not specify a fabric, fabric list, or provider.  Even when the DAPL fabric is chosen explicitly (I_MPI_FABRICS=shm:dapl), we still get an error unless a specific provider is also selected.  We do not set any default fabrics or providers via our modules.
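For reference, this is roughly how we launch when we force the DAPL fabric, and how a provider could be pinned explicitly; the provider name in the second example is only an illustrative /etc/dat.conf entry name, not something we normally set:

    # force shm for intra-node and DAPL for inter-node communication
    export I_MPI_FABRICS=shm:dapl
    mpirun -n 64 ./our_app

    # pinning a specific provider by its /etc/dat.conf entry name (name shown is just an example)
    export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u
    mpirun -n 64 ./our_app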

The message is:  DAT: library load failure: libdaplomcm.so.2: cannot open shared object file: No such file or directory

This is occurring on non-Phi nodes.  MPSS is installed only on the Phi nodes, and thus libdaplomcm is present only on the Phi nodes.
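To confirm the split, we simply checked which DAPL libraries are installed on each node type (assuming the standard layout where the provider libraries live in /usr/lib64):

    # non-Phi node: only the OFED-supplied ucm/scm libraries are present
    ls /usr/lib64/libdaplo*cm.so.2
    # libdaploscm.so.2  libdaploucm.so.2

    # Phi node with MPSS installed: libdaplomcm.so.2 is also present
    ls /usr/lib64/libdaplo*cm.so.2
    # libdaplomcm.so.2  libdaploscm.so.2  libdaploucm.so.2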

According to the Intel MPI Reference Manual, Intel MPI chooses the first DAPL provider it finds, but in our /etc/dat.conf the providers that reference libdaplomcm are all listed below the libdaploucm and libdaploscm providers, which we know work and whose libraries are present in /usr/lib64.
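For context, the relevant part of our /etc/dat.conf is ordered roughly like this (entry names are representative of the Mellanox OFED / MPSS defaults rather than copied verbatim):

    # ucm/scm providers listed first; their libraries exist on every node
    ofa-v2-mlx4_0-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_0 1" ""
    ofa-v2-scm-roe-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
    # mcm providers listed further down; libdaplomcm.so.2 only exists where MPSS is installed
    ofa-v2-mcm-1 u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_0 1" ""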

Why is Intel MPI trying to use a provider that is listed below other providers in dat.conf?  What changed to make it attempt libdaplomcm rather than the other providers that are actually available?
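In case it helps, this is how we have been checking which provider Intel MPI actually selects at runtime (assuming I_MPI_DEBUG reports the DAPL provider as documented):

    # debug level 2 and above prints the chosen fabric and DAPL provider per rank
    export I_MPI_FABRICS=shm:dapl
    export I_MPI_DEBUG=4
    mpirun -n 4 ./our_app 2>&1 | grep -i dapl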

Anyone else seen something like this?

