My application performs one-to-one, one-sided communication (every machine actively communicates with all the other machines), using Intel MPI over Mellanox InfiniBand.
I am observing performance bottlenecks in network bandwidth, and I am considering moving some of the communication to collective calls if that would reduce bandwidth usage.
From the documents I have read that describe algorithms for collective calls, the total bandwidth usage looks the same in every case (I guess they mainly reduce latency).
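As a rough illustration of that impression, here is a simplified model of a broadcast, counting only the payload bytes injected into the network. The function names and the model itself are my own assumptions for illustration. It ignores topology, protocol headers, and whatever algorithm Intel MPI actually selects.

```python
def flat_broadcast(p, m):
    """Root sends the m-byte message directly to each of the other p-1 ranks.

    Returns (total payload bytes sent, number of sequential send rounds).
    """
    total_bytes = (p - 1) * m
    rounds = p - 1  # assumes the root sends one message per round
    return total_bytes, rounds


def binomial_broadcast(p, m):
    """Binomial-tree broadcast: every rank that already holds the data
    forwards it to one rank that does not, so holders roughly double each round.

    Returns (total payload bytes sent, number of rounds).
    """
    have = 1          # ranks that currently hold the message (root only)
    total_bytes = 0
    rounds = 0
    while have < p:
        senders = min(have, p - have)  # cannot send to more ranks than remain
        total_bytes += senders * m
        have += senders
        rounds += 1
    return total_bytes, rounds


if __name__ == "__main__":
    p, m = 16, 1024  # hypothetical: 16 ranks, 1 KiB message
    print("flat:    ", flat_broadcast(p, m))      # (15360, 15)
    print("binomial:", binomial_broadcast(p, m))  # (15360, 4)
```

In this model both schemes move the same (p - 1) * m bytes in total. The tree only shortens the number of rounds and spreads the sending across ranks instead of concentrating it on the root's link.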
Is there any benefit to using collective calls instead of one-sided calls with respect to total network bandwidth consumption?