I am trying to run a Fortran MPI-based code (Incompact3d) on a cluster. The code works fine in local execution (i7) or on a single node (dual Xeon), even with aggressive optimization options like -fast. On our cluster, however, I can only make it work with gcc + OpenMPI; the Intel toolchain does not work.
For example, with gcc 4.6.3, compiling with mpif90 -O3 -funroll-loops -ftree-vectorize -cpp -march=native -g -fbacktrace -ffast-math and launching with mpirun -machinefile nodefile -np 96 ./incompact3d works, but is very slow.
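For reference, this is roughly how I invoke the Intel build (a sketch, not the exact makefile; -xHost/-fpp/-traceback are what I substitute for -march=native/-cpp/-fbacktrace):
$ mpiifort -O3 -xHost -fpp -g -traceback -o incompact3d *.f90
$ mpirun -machinefile ./nodes -n 96 ./incompact3d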
Some info about the Intel installation and the cluster:
$mpirun --version
Intel(R) MPI Library for Linux* OS, Version 4.1.0 Build 20120831
Copyright (C) 2003-2012, Intel Corporation. All rights reserved.
$mpiifort --version
ifort (IFORT) 13.0.1 20121010
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
$uname -a
Linux cerrado01n 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
$ofed_info > ofed_info (attached file)
$rpm -qa | grep dapl -> no output
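Since no dapl packages show up, I am not sure the DAPL provider is installed at all. A sketch of what I would check next, assuming the standard OFED locations on this Ubuntu install:
$ cat /etc/dat.conf   # DAPL provider registry, if present
$ ibv_devinfo         # verbs devices visible to OFED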
$ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 192379
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 192379
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
$ mpirun -genvall -genv I_MPI_DEBUG 5 -genv I_MPI_HYDRA_DEBUG 1 -genv I_MPI_FABRICS=shm:dapl -machinefile ./nodes -n 96 ./incompact3d > log (full log attached)
The run aborts with:
unexpected disconnect completion event from [27:cerrado02n]
Assertion failed in file ../../dapl_conn_rc.c at line 1128: 0
internal ABORT - process 28
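Since the rpm query above suggests DAPL may be missing, one sanity check I am considering is forcing the TCP fabric instead (a sketch, not yet verified at 96 ranks; shm:tcp is a documented I_MPI_FABRICS value in Intel MPI 4.1):
$ mpirun -genvall -genv I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:tcp -machinefile ./nodes -n 96 ./incompact3d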
Am I doing something wrong? I have no clue.
Thanks in advance.