Channel: Clusters and HPC Technology

Fine-grain time synchronization among HPC nodes


Hi all,

I need to profile an HPC application on multiple nodes with very low overhead. In the application code, I need to monitor MPI synchronization points (barrier, alltoall, etc.). I'm using the invariant TSC (RDTSC/RDTSCP instructions) because I cannot rely on clock_gettime() due to the high overhead of syscalls. As far as I know, TSCs should be synchronized among cores and sockets on the same node, so I should have no problems with intra-node timing synchronization.
But I have the following concerns:

1) How can I synchronize TSCs among different nodes with very fine-grained (sub-microsecond) accuracy? I think the developers of "Intel Trace Analyzer and Collector" must have faced similar problems.

2) I suppose that TSCs on different nodes always increment at a fixed nominal frequency. Do you think that the clock oscillators driving invariant TSCs can drift slightly? I suppose so, but in that case, for long application runs, profilers on different nodes can produce inconsistent inter-node timing information. Moreover, if TSCs are affected by clock drift, I cannot reliably convert timestamps to seconds.
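
To make concern 1 concrete, here is a minimal sketch of one way the inter-node offset might be estimated offline from ping-pong exchanges (the classic NTP-style calculation). It is only an illustration under assumptions: the names `PingPong` and `estimate_offset` are hypothetical, the four timestamps are raw TSC readings collected around a small message exchange, both nodes are assumed to tick at nearly the same nominal rate, and the network path is assumed to be roughly symmetric. I do not know whether any existing tool works this way.

```python
from typing import NamedTuple, Sequence, Tuple


class PingPong(NamedTuple):
    """One ping-pong exchange; all values are raw TSC ticks (hypothetical layout)."""
    t0: int  # local TSC just before sending the ping
    t1: int  # remote TSC when the ping arrived
    t2: int  # remote TSC just before sending the reply
    t3: int  # local TSC when the reply arrived


def estimate_offset(samples: Sequence[PingPong]) -> Tuple[float, int]:
    """Return (offset, rtt) taken from the exchange with the smallest round trip.

    offset is "remote minus local" in ticks. Assuming equal tick rates on both
    nodes, the true offset lies within offset +/- rtt / 2; the midpoint is only
    exact when the forward and return delays are equal.
    """
    if not samples:
        raise ValueError("need at least one ping-pong sample")

    def round_trip(s: PingPong) -> int:
        # Time on the wire: total local elapsed time minus remote processing time.
        return (s.t3 - s.t0) - (s.t2 - s.t1)

    # The exchange with the smallest round trip has the tightest error bound.
    best = min(samples, key=round_trip)
    offset = ((best.t1 - best.t0) + (best.t2 - best.t3)) / 2
    return offset, round_trip(best)
```

Repeating the exchange many times and keeping the minimum round trip is what bounds the error: the estimate can never be off by more than half of that round trip, so a sub-microsecond target needs a round trip below roughly two microseconds' worth of ticks.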

My target system is an HPC machine composed of dual-socket Broadwell nodes interconnected with an Omni-Path network.
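
For concern 2, a possible way to cope with drift is to treat the remote TSC as a linear function of the local one and fit that line from several synchronization points spread over the run. The sketch below is only an assumption about how this could be done in post-processing: `fit_linear_clock`, `remote_to_local` and `ticks_to_seconds` are hypothetical helpers, the input ticks are assumed to be taken relative to the start of the run (so they fit comfortably in double precision), and `tsc_hz` is assumed to be a frequency measured against a trusted reference clock rather than the nominal value.

```python
from typing import Sequence, Tuple


def fit_linear_clock(local_ticks: Sequence[float],
                     remote_ticks: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of remote ~= slope * local + intercept.

    slope - 1 is the relative drift between the two TSCs; intercept is the
    offset (in ticks) at local tick 0.
    """
    n = len(local_ticks)
    if n < 2 or n != len(remote_ticks):
        raise ValueError("need at least two paired synchronization points")
    mean_l = sum(local_ticks) / n
    mean_r = sum(remote_ticks) / n
    var_l = sum((l - mean_l) ** 2 for l in local_ticks)
    if var_l == 0:
        raise ValueError("local timestamps must not all be identical")
    cov = sum((l - mean_l) * (r - mean_r)
              for l, r in zip(local_ticks, remote_ticks))
    slope = cov / var_l
    intercept = mean_r - slope * mean_l
    return slope, intercept


def remote_to_local(remote_tick: float, slope: float, intercept: float) -> float:
    """Map a remote TSC reading onto the local node's tick timeline."""
    return (remote_tick - intercept) / slope


def ticks_to_seconds(ticks: float, tsc_hz: float) -> float:
    """Convert a tick interval to seconds using a measured TSC frequency."""
    return ticks / tsc_hz
```

With only two synchronization points (for example, one at MPI_Init and one at MPI_Finalize) the fit degenerates to linear interpolation between them; more points would also show whether the drift is really constant over the run.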

Thanks to all in advance,
Daniele

