Hi all,
I am trying to collect tracing information for my Intel MPI job. For a relatively small number of processes (around 300), the run either hangs or fails with the following error message:
UCM connect: REQ RETRIES EXHAUSTED: 0x570 32c43 0xed -> 0x544 3f3a4 0xbbdd
How can I debug this error?
Best,
Igor