Channel: Clusters and HPC Technology

mpirun error when running in a cpuset


I get errors when using mpirun within a cpuset (regardless of whether the cset shield is activated or not).

cset set -lr
cset:
Name    CPUs-X      MEMs-X    Tasks  Subs  Path
------  ----------  --------  -----  ----  -------
root    0-431 y     0-11 y    4956   2     /
user    24-431 n    1-11 n    0      0     /user
system  0-23 n      0 n       0      0     /system

which mpirun
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/bin/mpirun

cset proc --move -p $$ /
mpirun -np 10 ./wrf.exe   # works properly

cset proc --move -p $$ /system
mpirun -np 10 ./wrf.exe   # works properly

cset proc --move -p $$ /user
mpirun -np 10 ./wrf.exe   # ERROR: segmentation fault
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/bin/mpirun: line 103: 343504 Segmentation fault (core dumped) mpiexec.hydra "$@" 0<&0

The error also happens when mpirun is launched this way:
cset proc --exec -s /user mpirun -- -np 10 ./wrf.exe

The fact that the error happens only in the /user cpuset is quite strange, isn't it?
After all, the /user cpuset doesn't differ much from the /system cpuset, where mpirun works properly!
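
One difference visible in the cset listing above is that /user is restricted to memory nodes 1-11 while /system uses only node 0. To see what the shell (and therefore mpirun) is actually allowed to use after each move, something like the following could help. This is only a minimal sketch: show_affinity is a hypothetical helper name, not part of cset or Intel MPI, and it assumes a Linux /proc filesystem.

```shell
#!/bin/sh
# Hypothetical helper (not part of cset or Intel MPI): print the CPUs and
# memory nodes a process is allowed to use, as reported by the Linux kernel.
show_affinity() {
    # Default to the status file of the current shell; a different
    # status file can be passed as the first argument.
    status_file="${1:-/proc/$$/status}"
    # Cpus_allowed_list and Mems_allowed_list are standard fields
    # in /proc/<pid>/status on Linux.
    grep -E '^(Cpus|Mems)_allowed_list:' "$status_file"
}

# Intended use (run by hand after each move, in the same shell):
#   cset proc --move -p $$ /system ; show_affinity
#   cset proc --move -p $$ /user   ; show_affinity
```

Comparing the two outputs should show whether the shell really ends up with the CPU and memory node ranges listed for each cpuset. It does not by itself explain the segmentation fault.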

The error happens whatever value is passed to -np, and also without the -np flag.

Can anybody help me?
Thanks from Italy,

Emanuele Lombardi

ifort (IFORT) 19.1.0.166 20191121
Intel(R) MPI Library for Linux* OS, Version 2019 Update 6 Build 20191024 (id: 082ae5608)
SLES15SP1
HP Superdome Flex (ex SGI UV)

topology
System type: Superdome Flex
System name: tiziano
Serial number: CZ20040JWV
12 Blades
432 CPUs (online: 0-431)
12 Nodes
2230 GB Memory Total
1 Co-processor
2 Fibre Channel Controllers
4 Network Controllers
1 SATA Storage Controller
1 USB Controller
1 VGA GPU
2 RAID Controllers

BTW, I had the same error in 2013, as you can see from:
https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technolog...

