...
Part of this system (MARCONI_Fusion) is reserved for the activity of EUROfusion (https://www.euro-fusion.org/). Details on the MARCONI_Fusion environment are reported in a dedicated document.
Access
All the login nodes have an identical environment and can be reached with SSH (Secure Shell) protocol using the "collective" hostname:
...
The information reported here refers to the general MARCONI partition. The production environment of MARCONI_Fusion is discussed in a separate document.
As usual on systems using SLURM, you can submit a script script.x using the command:
...
The queues serving the MARCONI_Fusion partition, instead, allow the use of nodes in flat/quadrant or cache/quadrant mode; please refer to the dedicated document.
The maximum memory which can be requested is 90 GB for cache nodes. However, to avoid memory swapping to disk, with the associated performance degradation, we strongly suggest requesting at most 86 GB on cache nodes.
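As an illustrative sketch only (the exact resource-selection string depends on the local PBS configuration of the site), a request that stays within the suggested 86 GB limit on a cache-mode node might look like:

```shell
# Hypothetical PBS resource request for one KNL cache-mode node,
# staying below the suggested 86 GB threshold to avoid swapping.
# The exact select syntax is an assumption, not a site-mandated string.
#PBS -l select=1:ncpus=68:mpiprocs=68:mem=86GB
```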
...
srun --mpi=pmi2 ./myexecutable
This will enqueue the job on the appropriate queue. Note that srun options such as --mpi=pmi2 must precede the executable name; anything placed after it is passed as an argument to the program itself.
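A minimal batch script along these lines might be sketched as follows; the node count, task count, walltime and account name are placeholders, not site-mandated values:

```shell
#!/bin/bash
#SBATCH --nodes=1               # number of nodes (placeholder)
#SBATCH --ntasks-per-node=36    # MPI tasks per node (placeholder)
#SBATCH --time=01:00:00         # requested walltime (placeholder)
#SBATCH --account=<account_no>  # your project account

# Launch the MPI executable on the allocated resources;
# srun options must come before the executable name.
srun --mpi=pmi2 ./myexecutable
```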
As already mentioned, if the "env-skl" module is loaded, the command "qstat" will list all the jobs submitted on the A3 partition. Analogously, the usual "-u" flag provides the list of jobs submitted by a specific user on the SKL nodes:
(SKL) [username@r000u07l02 ~]$ qstat -u $USER -w
The complete list of the user's jobs is shown with the full jobid. To obtain a full job-status display, given a job_id reported by the qstat command, you can use the "-f" flag:
(SKL) [username@r000u07l02 ~]$ qstat -f <job_id>
and to delete a job, you need to type:
(SKL) [username@r000u07l02 ~]$ qdel <job_id>
For the previous commands to work properly, please note that, unlike on the A1 partition (where the job number alone suffices to identify a job) and analogously to A2, on A3 the full job id is required, as reported by the command "qstat -u $USER -w" (i.e. <job_number>.<PBS A3 server>, for instance 37239.r000u26s04).
By unloading the env-skl module you will restore the default PBS configuration (and the default prompt); all PBS commands will then refer to the server installed on the Broadwell (A1) partition.
Loading the module is the recommended option for submitting jobs to the SKL (A3) partition, but an alternative method is provided for users who frequently work on multiple partitions (even though we suggest starting two separate shells to deal with the two production environments). When logging in to Marconi's login nodes, the default PBS environment refers to the Broadwell partition (hence, all PBS commands will refer to queues and jobs submitted on the Broadwell nodes). If you do not load the env-skl module, you can still query the A3 PBS server (and submit jobs to the A3 nodes) by explicitly providing the server to be queried. Two aliases have been defined to identify the A3 PBS servers: the primary server skl1 and the secondary server skl2. Hence, with the command:
[username@r000u07l02 ~] qstat -q
you will see the list of A1 queues, while the command:
[username@r000u07l02 ~] qstat -q @skl1
or
[username@r000u07l02 ~] qstat -q @skl2
will report the list of queues defined on the A3 partition.
If you load the env-skl module, the configuration file automatically selects the active server; if instead you want to query the A3 PBS servers explicitly, query the primary server (skl1) first and, if the connection is closed, fall back to the secondary one (skl2). The same applies to all other uses of the "qstat" command. For example, to see all the jobs submitted by a user, the command:
[username@r000u07l02 ~] qstat -u $USER
will provide the list of jobs submitted to the Broadwell partition, while the command:
[username@r000u07l02 ~] qstat -u $USER @skl1
will provide the list of jobs submitted to the SKL partition.
To delete an A3 job from the A1 or A2 partition:
[username@r000u07l02 ~] qdel <jobid>@skl1
Be sure that there are no spaces between the jobid and "@skl1". Also make sure that you use the full jobid that is returned from "qstat -u $USER -w", because without the -w option the jobid might be truncated.
Finally, you can switch from the A3 partition to the A1 or A2 partition by loading the env-bdw or env-knl module, which also changes the prompt by prepending (BDW) or (KNL) to the user-defined or default prompt. Alternatively, you can unload the env-skl module to restore the original prompt.
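For example, the switch described above amounts to (prompts shown in comments are indicative):

```shell
module load env-bdw     # PBS commands now address the Broadwell (A1) server; prompt shows (BDW)
module load env-knl     # PBS commands now address the KNL (A2) server; prompt shows (KNL)
module unload env-skl   # restore the default PBS configuration and the original prompt
```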
Summary
The following table summarizes the main features and limits imposed on the queues of the shared A1 and A2 partitions. For the MARCONI_Fusion dedicated queues please refer to the dedicated document.
| Submission queue | Partition | Execution queue | # cores per job | Max walltime | Max running jobs per user / max n. of cpus per user | Max memory per job | Priority | HBM/clustering mode | Notes |
|---|---|---|---|---|---|---|---|---|---|
| route | A1 | debug | min = 1, max = 144 | 02:00:00 | 4/144 | 123 GB/node (suggested: 118 GB/node) | 40 | — | managed by route; runs on 24 nodes shared with the visualrcm queue |
| route | A1 | prod | min = 1, max = 2304 | 24:00:00 | 20/2304 | 123 GB/node (suggested: 118 GB/node) | 50 | — | managed by route |
| route | A1 | bigprod | min = 2305, max = 6000 | 24:00:00 | 1/6000 | 123 GB/node (suggested: 118 GB/node) | 60 | — | managed by route |
| special | A1 | special | min = 1, max = 36 | 180:00:00 | — | 123 GB/node (suggested: 118 GB/node) | 100 | — | ask superc@cineca.it; #PBS -q special |
| serial | A1 | serial | 1 | 04:00:00 | max 12 jobs on this queue, max 4 jobs per user | 1 GB | 30 | — | #PBS -q serial |
| visualrcm | A1 | visualrcm | min = 1, max = 144 | 03:00:00 | 1/144 | 123 GB/node (suggested: 118 GB/node) | 40 | — | runs on 24 nodes shared with the debug queue |
| knlroute | A2 | knldebug | min = 1, max = 136 (2 nodes) | 00:30:00 | 5/340 | 90 GB/node with mcdram=cache (suggested: 86 GB/node) | 40 | mcdram=cache; numa=quadrant | managed by knlroute; runs on 144 dedicated nodes |
| knlroute | A2 | knlprod | min > 136, max = 68000 (1000 nodes) | 24:00:00 | 20/68000 | 90 GB/node with mcdram=cache (suggested: 86 GB/node) | 50 | mcdram=cache; numa=quadrant | managed by knlroute |
| knltest | A2 | knltest | min = 1, max = 952 (14 nodes) | 24:00:00 | — | 90 GB/node with mcdram=cache (suggested: 86 GB/node); 105 GB/node with mcdram=flat (suggested: 101 GB/node) | 30 | mcdram=<cache/flat>; numa=quadrant | ask superc@cineca.it; #PBS -q knltest; #PBS -W group_list=<account_no> |
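Putting the knltest directives from the table together, a submission script might be sketched as follows; the resource-selection line is an assumption about the local PBS syntax, and the walltime is a placeholder within the queue's 24:00:00 limit:

```shell
#!/bin/bash
#PBS -q knltest                     # queue name, as listed in the table
#PBS -W group_list=<account_no>     # account, as required for knltest
#PBS -l walltime=04:00:00           # placeholder walltime (queue max: 24:00:00)
# The select string below is an assumption about the local resource syntax,
# within the suggested 86 GB cache-mode memory limit.
#PBS -l select=1:ncpus=68:mem=86GB:mcdram=cache:numa=quadrant

cd $PBS_O_WORKDIR
./myexecutable
```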
...