Channel: Clusters and HPC Technology

How to use the "I_MPI_ADJUST" option efficiently?


Hi All,

I found that the I_MPI_ADJUST options can be used to change the algorithm used for MPI communication and test whether performance improves.
I also learned that there is an AUTOTUNE function that can test multiple options at once. I would like to use this feature to find the best options for my application.
However, the application takes a long time to run and there are many I_MPI_ADJUST options, so I need to rule out unnecessary experiments and prioritize the rest.

Before setting up experiments with a specific application, I would like to ask the experts a few questions.

Q1. The autotuning page says that Intel MPI performance depends on the platform.
   Can it also depend on the transfer volume, the number of calls, or the number of nodes used by the application?
   https://software.intel.com/en-us/node/810193

Q2. On the I_MPI_ADJUST page, the default value is 0, and the page says "The default value of zero selects the optimized default settings".
    Does this mean that one of the options from 1 to N is chosen? Or is it Intel's secret recipe?
  software.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-windows/top/environment-variable-reference/i-mpi-adjust...

Q3. Is there an I_MPI_ADJUST option for Waitall? There is one for MPI_Barrier. If I set it, does it affect Waitall?

Q4. Is it possible that the optimal I_MPI_ADJUST option for an application differs between Intel MPI versions?

I plan to test it with the following process.
If any step is unnecessary or any step is missing, please comment.

P1. Check the total-time summary of MPI functions in the APS report and list the top 5 functions by elapsed time.

P2. For each of these 5 functions, use autotuning to list the top 5 options that work well on the current platform.

P3. Run the application in 25 cases (5 functions * 5 options) and compare each case against the baseline (no I_MPI_ADJUST options set) to check the performance improvement.

P4. Run a final case that combines the best option for each function and compare it against the baseline.
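
As a rough sketch of P1 to P4, the list of runs could be generated with a small script like the one below. The function names and algorithm IDs are placeholders I made up for illustration (they would come from the APS report and the autotuning results), and the helper names build_cases and combined_case are hypothetical. The only part taken from the documentation is the I_MPI_ADJUST_<opname> naming of the environment variables.

```python
# Placeholder inputs (assumptions, not measured data):
# replace with the top 5 functions from the APS report (P1)
# and the 5 best algorithm IDs per function from autotuning (P2).
top_functions = ["ALLREDUCE", "BCAST", "ALLTOALL", "ALLGATHER", "REDUCE"]
candidate_algorithms = {name: [1, 2, 3, 4, 5] for name in top_functions}


def build_cases(functions, candidates):
    """P3: return (label, env) pairs, the baseline plus one case per (function, algorithm)."""
    cases = [("baseline", {})]  # baseline sets no I_MPI_ADJUST_* variable
    for name in functions:
        for alg in candidates[name]:
            env = {f"I_MPI_ADJUST_{name}": str(alg)}
            cases.append((f"{name}={alg}", env))
    return cases


def combined_case(best):
    """P4: build one case from a {function: best_algorithm_id} mapping."""
    env = {f"I_MPI_ADJUST_{name}": str(alg) for name, alg in best.items()}
    return ("combined", env)


cases = build_cases(top_functions, candidate_algorithms)
for label, env in cases:
    settings = " ".join(f"{key}={value}" for key, value in env.items())
    print(f"{label}: {settings}".rstrip())
```

Each env dictionary would be merged into the environment of the job launch for that case. After the 25 runs, the best ID per function would be passed to combined_case to build the final run for P4.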

Thank you for reading this long question.
If you have any comments, please share them.

Best Regards,
Kihang

