...
In the following table you can find the main features and limits imposed on the partitions of M100.
| SLURM partition | Job QOS | # cores / # GPUs per job | max walltime | max running jobs per user / max n. of cpus/nodes/GPUs per user | priority | notes |
|---|---|---|---|---|---|---|
| m100_all_serial (def. partition) | normal | max = 1 core, max mem = 7600 MB | 04:00:00 | 4 cpus / 1 GPU | 40 | |
| m100_usr_prod | normal | max = 16 nodes | 24:00:00 | | 40 | runs on 880 nodes |
| m100_usr_prod | m100_qos_dbg | max = 2 nodes | 02:00:00 | 2 nodes / 64 cpus / 8 GPUs | 45 | runs on 12 nodes |
| m100_usr_prod | m100_qos_bprod | min = 17 nodes, max = 256 nodes | 24:00:00 | 256 nodes | 85 | runs on 512 nodes |
| m100_usr_preempt | normal | max = 16 nodes | 24:00:00 | | 1 | runs on 99 nodes |
| m100_fua_prod (EUROFUSION) | normal | max = 16 nodes | 24:00:00 | | 40 | runs on 68 nodes |
| m100_fua_prod (EUROFUSION) | m100_qos_fuadbg | max = 2 nodes | 02:00:00 | | 45 | runs on 12 nodes |
| m100_fua_prod (EUROFUSION) | m100_qos_fuabprod | max = 32 nodes | 24:00:00 | | 40 | runs on 64 nodes at the same time |
| all partitions | qos_special | > 32 nodes | > 24:00:00 | | 40 | request to superc@cineca.it |
| all partitions | qos_lowprio | max = 16 nodes | 24:00:00 | | 0 | active projects with exhausted budget |
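As an example, a job targeting the debug QOS of m100_usr_prod could be submitted with a batch script along these lines (a minimal sketch: the account name and the executable are placeholders, and the requested resources must stay within the limits in the table above, i.e. at most 2 nodes and 02:00:00 of walltime for m100_qos_dbg):

```shell
#!/bin/bash
#SBATCH --partition=m100_usr_prod   # production partition
#SBATCH --qos=m100_qos_dbg          # debug QOS: max 2 nodes, max 02:00:00
#SBATCH --nodes=2
#SBATCH --time=02:00:00
#SBATCH --gres=gpu:4                # GPUs per node
#SBATCH --account=<your_account>    # placeholder: your project account

srun ./my_application               # placeholder executable
```

Submit it with `sbatch job.sh` and check its state with `squeue -u $USER`.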
The m100_usr_preempt partition allows users to access the additional nodes of the m100_fua_prod partition in preemptable mode (when available and not in use by the EUROFUSION community). Jobs submitted to m100_usr_preempt may be killed if their assigned resources are requested by jobs submitted to the higher-priority partition (m100_fua_prod); hence we recommend using it only with restartable applications.
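A restartable job on m100_usr_preempt can ask SLURM to put it back in the queue after preemption instead of abandoning it (a sketch: whether preempted jobs are requeued or simply cancelled depends on the cluster's preemption configuration, and the application name is a placeholder):

```shell
#!/bin/bash
#SBATCH --partition=m100_usr_preempt
#SBATCH --nodes=1
#SBATCH --time=24:00:00
#SBATCH --requeue                   # allow the job to be requeued if preempted
#SBATCH --account=<your_account>    # placeholder: your project account

# The application itself should checkpoint periodically and resume
# from the last checkpoint on restart, since preemption can occur
# at any time.
srun ./my_restartable_application   # placeholder executable
```

The `--requeue` flag only helps if the application can resume from its own checkpoints; without checkpointing, a requeued job restarts from scratch.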
...