Real-Time Performance Monitoring and Faster Debugging with NCCL Inspector and Prometheus | NVIDIA Technical Blog
… The following is an example: nccl_p2p_bus_bandwidth_gbs{version="v5.1",slurm_job_id="1670760",node="nvl72033-T01",gpu="GPU0",comm_name="unknown",n_nodes="1",nranks="64",p2p_operation="Send",message_size="1-2MB"} 19.1634 nccl_p2p_exec_time_microseconds{version="v5.1",slurm_job_id="1670760",node="nvl… …
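To make the structure of the sample above concrete, here is a minimal Python sketch that splits one such line into its metric name, labels, and value. It is an illustration only, not part of NCCL Inspector or Prometheus: `parse_sample` is a hypothetical helper, and the metric and label names are written with underscores on the assumption that they follow Prometheus naming rules, which do not permit spaces. Only the first, complete sample from the excerpt is used, because the second one is truncated.

```python
import re

# The one complete sample from the excerpt above, in Prometheus text format.
# Underscores in the names are an assumption (Prometheus names cannot contain spaces).
SAMPLE = (
    'nccl_p2p_bus_bandwidth_gbs{version="v5.1",slurm_job_id="1670760",'
    'node="nvl72033-T01",gpu="GPU0",comm_name="unknown",n_nodes="1",'
    'nranks="64",p2p_operation="Send",message_size="1-2MB"} 19.1634'
)

# metric_name{label="value",...} numeric_value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
# A single label pair; quoted values may contain backslash escapes.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Hypothetical helper: return (name, labels, value) for one sample line.

    Escape sequences inside label values are left as-is, and optional
    timestamps are not handled; this is a sketch, not a full parser.
    """
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a labelled Prometheus sample: %r" % line)
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(SAMPLE)
print(name, labels["p2p_operation"], labels["message_size"], value)
```

For anything beyond a quick check, a maintained parser for the Prometheus exposition format is a safer choice than a hand-written regular expression.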