...
Each KNL node exposes itself to SLURM as having 68 cores (corresponding to the physical cores of the KNL processor). Jobs should request the entire node (hence, #SBATCH -n 68); in any case, the KNL SLURM server is configured to assign the KNL nodes exclusively, even if fewer cpus are requested. Hyper-threading is enabled, so you can run up to 272 processes/threads on each assigned node.
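As a minimal sketch, a job header requesting a full KNL node could look like the following (the account name, partition choice, and executable are placeholders, not prescribed by this guide):

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 68              # request all 68 physical cores; nodes are assigned exclusively anyway
#SBATCH --time=00:30:00
#SBATCH -p knl_usr_prod
#SBATCH -A <account_no>    # placeholder: your project account

# With hyper-threading enabled, up to 272 processes/threads (4 per core)
# can run on the node, e.g. 68 MPI ranks with 4 OpenMP threads each:
export OMP_NUM_THREADS=4
srun ./my_knl_app          # placeholder executable
```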
The preliminary configuration of the Marconi-A2 partition allowed users to explore the different HBM modes (on-package high-bandwidth memory based on the multi-channel dynamic random access memory (MCDRAM) technology) and clustering modes of cache operations, requested via:
#SBATCH --constraint=flat (or cache)
Please refer to the official Intel documentation for a description of the different modes. Following the suggestions of Intel experts, we finally adopted a single configuration for all the KNL racks serving the knldebug and knlprod (academic) queues, namely:
- cache/quadrant
The queues serving the Marconi FUSION partition instead allow the use of nodes in flat/quadrant or cache/quadrant mode; please refer to the dedicated document.
The maximum memory which can be requested is 86000MB, for both cache and flat nodes:
#SBATCH --mem=86000
For flat nodes the jobs can require the KNL high bandwidth memory (HBM) using Slurm's Generic RESource (GRES) options:
#SBATCH --constraint=flat
#SBATCH --mem=86000
#SBATCH --gres=hbm:16g
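Putting these directives together, a flat-node job script might look like the following sketch (the account name, walltime, and executable are placeholders):

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 68
#SBATCH --time=01:00:00     # placeholder walltime
#SBATCH -p knl_usr_prod
#SBATCH --constraint=flat   # request a flat-mode node
#SBATCH --mem=86000         # maximum requestable memory per node (MB)
#SBATCH --gres=hbm:16g      # request the 16g of on-package HBM (MCDRAM)
#SBATCH -A <account_no>     # placeholder: your project account

srun ./my_knl_app           # placeholder executable
```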
For example, to request a single KNL node in a production queue the following SLURM job script can be used:
...
| MARCONI partition | SLURM partition | QOS | # cores per job | max walltime | max running jobs per user / max n. of cpus per user | max memory per node (MB) | HBM/clustering mode | notes |
|---|---|---|---|---|---|---|---|---|
| front-end | bdw_all_serial (default partition) | bdw_all_serial | 1 | 04:00:00 | max 12 running jobs, max 4 jobs/user | 3000 | | |
| A1 | bdw_all_rcm | bdw_all_rcm | min = 1, max = 144 | 03:00:00 | 1 / 144 | 118000 | | runs on 24 nodes shared with the debug queue |
| A1 | bdw_usr_dbg | bdw_usr_dbg | min = 1, max = 144 | 02:00:00 | 4 / 144 | 118000 | | managed by route; runs on 24 nodes shared with the visualrcm queue |
| A1 | bdw_usr_prod | no QOS | min = 1, max = 2304 | 24:00:00 | 20 / 2304 | 118000 | | |
| A1 | bdw_usr_prod | bdw_qos_bprod | min = 2305, max = 6000 | 24:00:00 | 1 / 6000 | 118000 | | #SBATCH -p bdw_usr_prod / #SBATCH --qos=bdw_qos_bprod |
| A1 | bdw_usr_prod | bdw_qos_special | min = 1, max = 36 | 180:00:00 | | 118000 | | ask superc@cineca.it; #SBATCH -p bdw_usr_prod / #SBATCH --qos=bdw_qos_special |
| A2 | knl_usr_dbg | knl_usr_dbg | min = 1, max = 136 (2 nodes) | 00:30:00 | 5 / 340 | 86000 (mcdram=cache/flat); flat nodes can request up to 16g of HBM | mcdram=cache, numa=quadrant | runs on 144 dedicated nodes |
| A2 | knl_usr_prod | no QOS | min = 1, max = 13260 (195 nodes) | 24:00:00 | 20 / 68000 | 86000 (mcdram=cache/flat); flat nodes can request up to 16g of HBM | mcdram=cache, numa=quadrant | |
| A2 | knl_usr_prod | knl_qos_bprod | min > 13260, max = 68000 (1000 nodes) | 24:00:00 | max 1 job/user, max 2 jobs/account | 86000 (mcdram=cache/flat); flat nodes can request up to 16g of HBM | mcdram=cache, numa=quadrant | ask superc@cineca.it; #SBATCH -p knl_usr_prod / #SBATCH --qos=knl_qos_bprod |
...