I have a 6-node cluster with 12 cores per node, for a total of 72 cores.
When I run the HPCC benchmark on 6 cores (1 core per node across all 6 nodes), the HPL result is 1198.87 GFLOPS. However, when I run HPCC on all 72 cores of the cluster (12 cores per node), the HPL result drops to 847.421 GFLOPS.
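To put the two results side by side, here is a small sketch of the per-core arithmetic. It uses only the two HPL figures quoted above; the names `runs` and `per_core_gflops` are just illustrative, and the numbers are taken at face value without checking how HPCC arrived at them.

```python
# Reported HPL results from the two HPCC runs: (total GFLOPS, cores used).
runs = {
    "6 cores (1 per node)": (1198.87, 6),
    "72 cores (12 per node)": (847.421, 72),
}


def per_core_gflops(total_gflops, cores):
    """Average throughput per core for one run."""
    return total_gflops / cores


for label, (gflops, cores) in runs.items():
    print(f"{label}: {per_core_gflops(gflops, cores):.2f} GFLOPS per core")

# Aggregate throughput of the 72-core run relative to the 6-core run.
ratio = runs["72 cores (12 per node)"][0] / runs["6 cores (1 per node)"][0]
print(f"72-core / 6-core aggregate: {ratio:.3f}")
```

So with 12 times as many cores, the aggregate HPL number is lower rather than higher, and the per-core number falls by more than an order of magnitude.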
MPI Library Used: Intel(R) MPI Library for Linux* OS, Version 2018 Update 1 Build 20171011 (id: 17941)
Options to mpiexec.hydra:
-print-rank-map
-pmi-noaggregate
-nolocal
-genvall
-genv I_MPI_DEBUG 5
-genv I_MPI_HYDRA_IFACE ens2f0
-genv I_MPI_FABRICS shm:tcp
-n 72
-ppn 12
-ilp64
--hostname filename
Any ideas why the 72-core run is slower than the 6-core run?
Thanks in advance.