...

A parallel program can be executed interactively only within an "Interactive" SLURM batch job, using the "srun" command: the job is queued and scheduled as any other job, but when executed, the standard input, output, and error streams are connected to the terminal session from which srun was launched.

For example, to start an interactive session with the MPI program myprogram, using one node and two processors, launch the command:

> srun -N1 -n2 --ntasks-per-node=2 -A <account_name> --pty /bin/bash

SLURM will then schedule your job to start, and your shell will be unresponsive until free resources are allocated for you.

When the shell comes back with the prompt, you can execute your program by typing:

> mpirun ./myprogram
> ^D (CTRL-D closes the interactive session)

SLURM automatically exports the environment variables you defined in the source shell, so if you need to run your program myprogram in a controlled environment (i.e. with specific library paths or options), you can prepare the environment in the origin shell and be sure to find it in the interactive shell that SLURM runs for you on the allocated resources. For example, if "myprogram" is not compiled statically, it is enough to define and export the LD_LIBRARY_PATH variable before launching the interactive session:

> export LD_LIBRARY_PATH= ...
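
Putting the pieces together, a minimal end-to-end sketch of the interactive workflow described above could look like the following; the library path is a hypothetical placeholder, and myprogram is assumed to be an MPI executable in the current directory:

> export LD_LIBRARY_PATH=/path/to/extra/libs:$LD_LIBRARY_PATH   # hypothetical, only needed if myprogram requires extra libraries
> srun -N1 -n2 --ntasks-per-node=2 -A <account_name> --pty /bin/bash
> mpirun ./myprogram
> ^D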

Batch

The information reported here refers to the general MARCONI partition. The production environment of MARCONI_Fusion is discussed in a separate document.

As usual on systems using SLURM, you can submit a script (e.g. script.x) using the command:

> sbatch script.x

You can get a list of the defined partitions with the command:

> sinfo -a

For more information and examples of job scripts, see the Batch Scheduler section.
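
As a quick reference, a minimal sketch of a job script (here called script.x) could look like the following; the job name, resource requests and executable are illustrative assumptions rather than site-mandated values:

#!/bin/bash
#SBATCH --job-name=myjob          # illustrative job name
#SBATCH -N 1                      # one node
#SBATCH -n 2                      # two MPI tasks
#SBATCH --time=00:30:00           # walltime (hh:mm:ss)
#SBATCH -A <account_name>         # your project account

mpirun ./myprogram

Submit it with the sbatch command shown above.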

Submitting serial Batch jobs

The serial partition is available with one core and a maximum walltime of 4 hours. It runs on the login nodes and is designed for serial pre/post-processing analysis, and for moving your data (via rsync, scp, etc.) in case more than 10 minutes are required to complete the data transfer. In order to use this partition you have to specify the SLURM flag "-p":

#SBATCH -p serial
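
For instance, a sketch of a serial job used for a long data transfer could look like this; the paths, walltime and remote host are hypothetical placeholders:

#!/bin/bash
#SBATCH -p serial                 # serial partition, running on the login nodes
#SBATCH -n 1                      # a single core
#SBATCH --time=02:00:00           # within the 4-hour limit
#SBATCH -A <account_name>         # your project account

# hypothetical transfer: replace source and destination with your own paths
rsync -av /path/to/local/results/ user@remote.host:/path/to/destination/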

Submitting Batch jobs for A1 partition

On MARCONI it is possible to submit jobs of different types using a "routing" partition: just declare how many resources you need and your job will be directed into the right production partition (bdwdebug, bdwprod or bdwbigprod) with the correct priority. Furthermore, there are two additional partitions not managed by the default routing, devoted to specific categories of jobs: the serial partition and the special partition.

The maximum number of cores that you can request is 6000 (about 167 nodes) with a maximum walltime of 24 hours:

  • If you do not specify the walltime (by means of the "#SBATCH --time" directive), a default value of 30 minutes will be assumed.
  • If you do not specify the number of cores (by means of the "#SBATCH -n" directive), a default value of 36 will be assumed.
  • If you do not specify the amount of memory (by means of the "#SBATCH --mem" directive), a default value of 3 GB will be assumed.
  • Even though you can ask for up to 123 GB, we strongly suggest limiting the requested memory per node to 118 GB, to avoid memory swapping to disk with serious performance degradation. A sample job script using these directives is sketched after this list.
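
For example, a sketch of an A1 job that sets these resources explicitly, instead of relying on the defaults, could look like the following; the values are illustrative and should be adapted to your application:

#!/bin/bash
#SBATCH -N 2                      # two nodes
#SBATCH --ntasks-per-node=36      # all 36 cores of each node
#SBATCH --time=24:00:00           # maximum allowed walltime
#SBATCH --mem=118G                # stay within the suggested 118 GB per node
#SBATCH -A <account_name>         # your project account

mpirun ./myprogram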

The special partition is designed for non-ordinary types of jobs, and users need to be enabled in order to use it. Please write to superc@cineca.it in case you think you need to use it.

...

The Marconi-A2 production environment is managed by the SLURM scheduler/resource manager.

After the first month of the testing/pre-production phase, we introduced a simplified and more robust way to manage job submission to the KNL (A2) partition. The first configuration relied on routing queues defined on the A1 PBS server and on the "env-knl" module (available in the "base" profile), whose loading also modified the prompt on the login nodes by prepending the "(KNL)" string, to remind users that they were using the production environment serving the A2 partition.

With respect to that previous configuration, the submission process is now simplified: you no longer need to load the "env-knl" module to submit jobs to the partitions based on Knights Landing processors. Instead, you simply need to specify the correct partition using the "#SBATCH -p" directive, choosing from the following list:

  • knldebug
  • knlprod
  • knlbigprod
  • knlspecial


Each KNL node exposes itself to SLURM as having 68 cores (corresponding to the physical cores of the KNL processor). Jobs should request the entire node (hence, "#SBATCH -n 68"), and the scheduler is configured to assign the KNL nodes in an exclusive way (even if fewer cores are asked for). Hyper-threading is enabled, hence you can run up to 272 processes/threads on each assigned node.
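
For reference, a sketch of a KNL job requesting a full node could look like the following; the partition name is taken from the list above, while the walltime and executable are illustrative:

#!/bin/bash
#SBATCH -p knlprod                # one of the KNL partitions listed above
#SBATCH -N 1                      # one KNL node, assigned exclusively
#SBATCH -n 68                     # all 68 physical cores of the node
#SBATCH --time=04:00:00           # illustrative walltime
#SBATCH -A <account_name>         # your project account

mpirun ./myprogram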

...