Hi,
I recently tried the APS tool to capture the message details (size and amount) for the WRF application on the intel 8280 (opa).
We launch 1 process per core, and here is the data -
nodes,Message_size(B),Volume(MB),Volume(%),Transfers,Time(sec),Time(%)
1,0,0,0,58099903,3988.14,97.05
2,0,0,0,219491539,7554.45,96.19
4,0,0,0,850730419,15073.44,96.02
It seems that significant amount of time is spent in the transfer of these 0 byte messages and with more number of nodes, the amount of messages increases. Could you please help me in understanding following-
Q: The significance of these 0 bytes messages and How are they related to MPI communication protocol?
I guess aps collects messages transferred between all processes (inter node + intra node), so
Q:Is there a way to check (from aps) that how much of these messages were transmitted to the network? (inter node messages - for 2 and 4 node runs)