
MPI_send/recv odd behavior


Hi all,

I am new here, but following advice from Intel, I am asking my question here.

I am using the Intel MPI beta version (2017.0.042). I have some codes that run locally, and everything works well. But in at least one case I get odd behavior. Inside my code, I do a first send/recv to exchange the data size and then a second send/recv for the data itself. When the size is small, everything works fine, but when I try to send more than 10,000 doubles, the program hangs in an infinite loop. Running GDB on the two MPI processes as shown below, pressing Ctrl+C and looking at the backtrace, I see something very strange.

mpirun -n 2 xterm -hold -e gdb --args ./foo -m <datafilename>

The sketch is to send from process 1 to process 0, like a small reduction. In that configuration, the destination is 0 and the source is 1. But in the backtrace, this information is corrupted, i.e. source = -1, which explains the infinite loop. Moreover, the tag variable, set to 0, has moved to another value.
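For clarity, here is a minimal sketch of what that size-then-data exchange looks like; the buffer name, the element count and everything outside the two send/recv pairs are assumptions for illustration, not the actual code of foo:

/* Minimal sketch of the size-then-data exchange described above.
 * The buffer contents, the count of 10000 and the surrounding program
 * are assumptions for illustration only. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, n = 10000;                /* roughly the size where the hang appears */
    const int tag = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        double *buf = calloc(n, sizeof(double));
        MPI_Send(&n, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);          /* 1) send the size    */
        MPI_Send(buf, n, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD);      /* 2) send the data    */
        free(buf);
    } else if (rank == 0) {
        MPI_Status st;
        MPI_Recv(&n, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &st);     /* 1) receive the size */
        double *buf = calloc(n, sizeof(double));
        MPI_Recv(buf, n, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD, &st); /* 2) receive the data */
        free(buf);
    }

    MPI_Finalize();
    return 0;
}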

So my idea is that there might be a buffer overflow. To check, I switched to MPICH 3.2, and there everything works fine.

 

Finally, following Gergana's advice, I looked at the troubleshooting guide and tried a few ideas. Once more, I got odd behavior: using the option below makes the bug go away (https://software.intel.com/fr-fr/node/535596).

mpirun -n 2 -genv I_MPI_FABRICS tcp ./foo -m <datafilename>
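For reference, the same workaround can also be applied by setting the variable in the environment before launching (equivalent to -genv, assuming a bash-like shell):

export I_MPI_FABRICS=tcp
mpirun -n 2 ./foo -m <datafilename>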

Well, my question is simply that I would like some help, information and/or an explanation about this. Is this bug coming from my usage of Intel MPI?

Thank you in advance for taking the time to read this.

Sebastien

PS: additional information: Asus UX31 laptop with Ubuntu 14.04 LTS and an Intel® Core™ i5-2557M CPU @ 1.70GHz × 4.

Thread topic:

Help Me
