...

Each KNL node exposes itself to SLURM as having 68 cores (corresponding to the physical cores of the KNL processor). Jobs should request the entire node (hence #SBATCH -n 68), and the KNL SLURM server is configured to assign KNL nodes exclusively (even if fewer cpus are requested). Hyper-threading is enabled, so you can run up to 272 processes/threads on each assigned node.
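
As a minimal sketch, a full-node request could look like the script below (the account name and walltime are placeholders, and the executable is hypothetical):

#!/bin/bash
# Minimal full-node KNL request: one node, all 68 physical cores.
# <account_name> is a placeholder; adjust the walltime to your needs.
#SBATCH -N 1
#SBATCH -n 68
#SBATCH --time=00:30:00
#SBATCH -A <account_name>
#SBATCH -p knl_usr_prod

srun ./my_program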

The preliminary configuration of the Marconi-A2 partition allowed users to request different HBM modes (on-package high-bandwidth memory based on the multi-channel dynamic random access memory (MCDRAM) technology) and different clustering modes of cache operations:

#SBATCH --constraint=flat/cache

Please refer to the official Intel documentation for a description of the different modes. Following the suggestions of the Intel experts, we finally adopted a single configuration for all the KNL racks serving the knldebug and knlprod (academic) queues, namely:

  • cache/quadrant
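
A quick way to check which memory mode is active on the node a job actually received is to inspect the NUMA topology from inside the job; this is only an illustrative sketch (numactl is assumed to be available on the compute nodes):

# On a flat node, "numactl -H" lists an extra NUMA node holding the
# 16 GB of MCDRAM (memory but no CPUs); on a cache node only the DDR
# NUMA node is visible, since the MCDRAM acts as a transparent cache.
numactl -H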

The queues serving the Marconi FUSION partition instead allow the use of nodes in flat/quadrant or cache/quadrant mode; please refer to the dedicated document.

The maximum memory which can be requested is 86000 MB for both cache and flat nodes:

#SBATCH --mem=86000

For flat nodes, jobs can also request the KNL high-bandwidth memory (HBM) using SLURM's Generic RESource (GRES) options:

#SBATCH --constraint=flat
#SBATCH --mem=86000
#SBATCH --gres=hbm:16g
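
Putting these directives together, a flat-node job that also asks for the high-bandwidth memory might look like the following sketch (again, the account name, walltime and executable are placeholders):

#!/bin/bash
# Sketch of a flat-mode KNL job requesting the 16 GB MCDRAM as a GRES.
# <account_name> is a placeholder; adjust the walltime to your needs.
#SBATCH -N 1
#SBATCH -n 68
#SBATCH --time=01:00:00
#SBATCH -A <account_name>
#SBATCH -p knl_usr_prod
#SBATCH --constraint=flat
#SBATCH --mem=86000
#SBATCH --gres=hbm:16g

srun ./my_program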

For example, to request a single KNL node in a production queue the following SLURM job script can be used:

...

MARCONI partition | SLURM partition | QOS | # cores per job | max walltime | max running jobs per user / max n. of cpus per user | max memory per node (MB) | priority | HBM/clustering mode | notes
------------------|-----------------|-----|-----------------|--------------|------------------------------------------------------|--------------------------|----------|---------------------|------
front-end | bdw_all_serial (default partition) | bdw_all_serial | 1 | 04:00:00 | max 12 running jobs / max 4 jobs per user | 3000 | | |
A1 | bdw_all_rcm | bdw_all_rcm | min = 1, max = 144 | 03:00:00 | 1 / 144 | 118000 | | | runs on 24 nodes shared with the debug queue
A1 | bdw_usr_dbg | bdw_usr_dbg | min = 1, max = 144 | 02:00:00 | 4 / 144 | 118000 | | | managed by route; runs on 24 nodes shared with the visualrcm queue
A1 | bdw_usr_prod | noQOS | min = 1, max = 2304 | 24:00:00 | 20 / 2304 | 118000 | | |
A1 | bdw_usr_prod | bdw_qos_bprod | min = 2305, max = 6000 | 24:00:00 | 1 / 6000 | 118000 | | | #SBATCH -p bdw_usr_prod and #SBATCH --qos=bdw_qos_bprod
A1 | bdw_usr_prod | bdw_qos_special | min = 1, max = 36 | 180:00:00 | | 118000 | | | ask superc@cineca.it; #SBATCH -p bdw_usr_prod and #SBATCH --qos=bdw_qos_special
A2 | knl_usr_dbg | knl_usr_dbg | min = 1, max = 136 (2 nodes) | 00:30:00 | 5 / 340 | 86000 (mcdram=cache/flat); for flat nodes up to 16g of hbm can be requested | | mcdram=cache, numa=quadrant | runs on 144 dedicated nodes
A2 | knl_usr_prod | noQOS | min = 1, max = 13260 (195 nodes) | 24:00:00 | 20 / 68000 | 86000 (mcdram=cache/flat); for flat nodes up to 16g of hbm can be requested | | mcdram=cache, numa=quadrant |
A2 | knl_usr_prod | knl_qos_bprod | min > 13260, max = 68000 (1000 nodes) | 24:00:00 | max 1 job per user, max 2 jobs per account | 86000 (mcdram=cache/flat); for flat nodes up to 16g of hbm can be requested | | mcdram=cache, numa=quadrant | ask superc@cineca.it; #SBATCH -p knl_usr_prod and #SBATCH --qos=knl_qos_bprod

...