MPI tasks

By default, SLURM uses the auto-affinity module to bind MPI tasks to CPUs. In some cases you may want to modify this default in order to ensure optimal performance.
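As a quick check of what binding is actually applied, you can ask srun to report it; a minimal sketch (the application name is a placeholder, and the report format may vary with the SLURM version):

srun --cpu-bind=verbose ./my_mpi_app   # print the CPU mask assigned to each task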

...

--cpus-per-task = floor(n° physical CPUs per node / n° tasks per node) * n° hyperthreads
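As a worked example, on a hypothetical node with 36 physical CPUs and 2 hardware threads per CPU, running 4 tasks per node gives:

#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18    # floor(36 / 4) * 2 = 18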

...

2) On "hyperthreading nodes", if you request for a single node more tasks than the available physical CPUs, you have to specify the following SLURM option:

--cpu-bind=threads

in order to ensure the binding between MPI tasks and logical CPUs and to avoid overloading the physical CPUs.
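A minimal sketch of this case, assuming an illustrative node with 36 physical and 72 logical CPUs (the executable name is a placeholder):

#SBATCH --ntasks-per-node=72           # more tasks than the 36 physical CPUs

srun --cpu-bind=threads ./my_mpi_app   # bind each task to a logical CPU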

Alternatively, you can request more CPUs for each single task, up to using all the logical CPUs of the node:

--cpus-per-task=<n° logical CPUs per node / n° tasks per node>
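For example, with 72 logical CPUs per node and 36 tasks per node (illustrative values):

#SBATCH --ntasks-per-node=36
#SBATCH --cpus-per-task=2    # 72 logical CPUs / 36 tasks = 2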

...

3) If you request for a single node a number of tasks equal to the number of physical CPUs or to the number of logical CPUs, there is no need to add the --cpu-bind or --cpus-per-task SLURM options. Each task is assigned a CPU in sequential order.
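A minimal sketch of this case, again assuming an illustrative node with 36 physical CPUs:

#SBATCH --ntasks-per-node=36   # one task per physical CPU; no binding options needed

srun ./my_mpi_app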

OpenMP threads

For OpenMP codes you have to make explicit the number of CPUs to allocate to each single task, to be used by the OpenMP threads. To do so, you can use the following SLURM option:

...

On nodes without hyperthreading the CPU concept coincides with the physical CPU (core), and consequently the number of CPUs per single task (--cpus-per-task) can be up to the maximum number of physical CPUs of the node. For example, on BDW and SKL nodes the number of CPUs per single task can be up to 36 and 48, respectively.
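A minimal sketch for a pure OpenMP run on a BDW node (36 physical cores); the executable name is a placeholder:

#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=36     # one thread per physical core

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./my_omp_app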


On hyperthreading nodes, it coincides with the logical CPU (thread), and consequently the number of CPUs per single task can be up to the maximum number of logical CPUs of the node.
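For instance, on an illustrative node with 36 physical and 72 logical CPUs:

#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=72     # one thread per logical CPU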

In order to define whether the OpenMP threads have to bind physical CPUs (cores) or logical CPUs (threads), you can use the following variable:

export OMP_PLACES=<cores|threads>

...

export OMP_PLACES=cores

...

export OMP_PLACES=threads

...
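Putting the pieces together, a minimal hybrid MPI+OpenMP job sketch on an illustrative 36-core node without hyperthreading (node geometry and executable name are placeholders):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=9      # 36 physical cores / 4 tasks

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OMP_PLACES=cores        # pin each OpenMP thread to a physical core

srun ./my_hybrid_app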