Hi,
There seems to be a problem with some versions of Intel MPI when files are accessed through MPI shared file pointers: the files are not written correctly. We are using Intel MPI 2017 on a cluster with a GPFS filesystem. The Linux kernel version is 3.10.0-327.36.3.el7.x86_64.
Here is a code that reproduces the problem.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strlen */
#include <mpi.h>

int main(int argc, char *argv[])
{
    char string[256];
    char file_name[] = "output";
    int count, slength;
    int open_error;
    int rank;
    MPI_File fh;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank writes one line through the shared file pointer. */
    sprintf(string, "Rank : %d\n", rank);
    slength = (int)strlen(string);

    open_error = MPI_File_open(MPI_COMM_WORLD, file_name,
                               MPI_MODE_CREATE | MPI_MODE_WRONLY,
                               MPI_INFO_NULL, &fh);
    if (open_error != MPI_SUCCESS) {
        fprintf(stderr, "Error opening file\n");
        MPI_Abort(MPI_COMM_WORLD, open_error);
    }

    MPI_File_write_shared(fh, string, slength, MPI_CHAR, &status);

    /* Report any rank whose write was shorter than requested. */
    MPI_Get_count(&status, MPI_CHAR, &count);
    if (slength != count) {
        fprintf(stderr, "rank %d: slength=%d , count=%d \n", rank, slength, count);
    }

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
One example of output is:
$ mpirun -np 10 ./mpi_shared ; cat output | sort -n -k 3
Rank : 2
Rank : 8
Rank : 9
i.e. the file is truncated: only 3 of the 10 expected lines are present.
With the following options set instead:
I_MPI_EXTRA_FILESYSTEM=on I_MPI_EXTRA_FILESYSTEM_LIST=gpfs
the MPI API seems to work, at least for this use case.
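To compare the two configurations without inspecting the file by eye, the check could be automated. Below is a minimal sketch in plain C (no MPI needed) that reads the file written by the reproducer and reports every rank that does not appear exactly once. The function name count_bad_ranks is hypothetical and only for illustration; the line format "Rank : %d" is the one used by the reproducer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the number of ranks in [0, nranks) that do
 * not appear exactly once in the file at `path`. Returns -1 if nranks is
 * not positive, the file cannot be opened, or memory allocation fails. */
int count_bad_ranks(const char *path, int nranks)
{
    if (nranks <= 0)
        return -1;

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int *seen = calloc((size_t)nranks, sizeof *seen);
    if (seen == NULL) {
        fclose(fp);
        return -1;
    }

    /* Count how often each rank's line shows up; ignore anything else. */
    char line[256];
    int rank;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "Rank : %d", &rank) == 1 && rank >= 0 && rank < nranks)
            seen[rank]++;
    }
    fclose(fp);

    /* A rank is "bad" if its line is missing or duplicated. */
    int bad = 0;
    for (int r = 0; r < nranks; r++) {
        if (seen[r] != 1) {
            fprintf(stderr, "rank %d appears %d time(s)\n", r, seen[r]);
            bad++;
        }
    }
    free(seen);
    return bad;
}
```

After a run with -np 10, calling count_bad_ranks("output", 10) from a small main should return 0 when every rank's line was written once, and a positive number when lines are missing as in the truncated case.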
I also tested this with Intel MPI 5.0.3, 5.1.1 and 5.1.3, and obtained the same results as with Intel MPI 2017.
It seems that, if the filesystem is not explicitly specified by means of the I_MPI_EXTRA_FILESYSTEM variables, the semantics of MPI_File_write_shared do not comply with the MPI standard.
Can you confirm this?
Thanks.
Stefano