...

From the output of the command it is possible to see that GPU0 and GPU1 are connected with NVLink, as are GPU2 and GPU3. The first pair is connected to CPUs 0-63, the second to CPUs 64-127. The CPUs are numbered from 0 to 127 because of four-way hyperthreading: 32 physical cores x 4 → 128 CPUs.

Knowledge of the node topology is important for correctly distributing the parallel threads of your applications in order to get the best performance.
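
As an illustration of how this topology can be used, the following minimal Python sketch maps a GPU index to the logical CPUs close to it and optionally pins the calling process to them. The function names `cpus_for_gpu` and `pin_to_gpu_cpus` are hypothetical, and the layout (GPU0 and GPU1 near CPUs 0-63, GPU2 and GPU3 near CPUs 64-127) is taken only from the description above; check it against the actual command output on your node before relying on it.

```python
import os

PHYSICAL_CORES = 32   # physical cores per node, as stated in the text
SMT_THREADS = 4       # hardware threads per core (four-way hyperthreading)
GPUS_PER_PAIR = 2     # GPU0-GPU1 and GPU2-GPU3 are the NVLink pairs
NUM_GPUS = 4          # assumption: four GPUs per node, GPU0..GPU3


def cpus_for_gpu(gpu_id):
    """Return the set of logical CPU ids close to the given GPU.

    Assumes the layout described above: the first GPU pair is attached
    to CPUs 0-63 and the second pair to CPUs 64-127.
    """
    if gpu_id not in range(NUM_GPUS):
        raise ValueError(f"gpu_id must be in 0..{NUM_GPUS - 1}")
    total_cpus = PHYSICAL_CORES * SMT_THREADS   # 32 x 4 = 128 logical CPUs
    cpus_per_pair = total_cpus // (NUM_GPUS // GPUS_PER_PAIR)
    pair = gpu_id // GPUS_PER_PAIR
    return set(range(pair * cpus_per_pair, (pair + 1) * cpus_per_pair))


def pin_to_gpu_cpus(gpu_id):
    """Restrict the calling process to the CPUs close to gpu_id (Linux only)."""
    os.sched_setaffinity(0, cpus_for_gpu(gpu_id))
```

For example, a process that works on GPU2 could call `pin_to_gpu_cpus(2)` at startup so that its threads are scheduled only on CPUs 64-127. In production jobs the same effect is usually obtained through the binding options of the batch scheduler or the MPI launcher rather than from application code.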


Internode communication is based on a Mellanox InfiniBand EDR network, and the OpenMPI and IBM Spectrum MPI libraries are configured to exploit the Mellanox Fabric Collective Accelerators (also on CUDA memory) and Messaging Accelerators.

...