Link to the new User Guide https://docs.hpc.cineca.it/index.html


Data Storage architecture

All HPC systems share the same logical disk structure and file systems definition.

The available storage areas can be characterized in several ways:

  • temporary (data are accessible for a defined time window before being deleted);
  • permanent (data are accessible up to six months after the end of the project);

or:

  • user specific (each username has a different data area);
  • project specific (accessible by all users linked to the same project).

or:

  • local (specific for each system);
  • shared (the same area can be accessed by all HPC systems)

The available data areas are defined, on all HPC clusters, through predefined environment variables. You can access these areas simply by using these variables:

cd $HOME
cd $SCRATCH
cd $WORK
cd $DRES
cd $FAST (Leonardo Only)
cd $PUBLIC (Leonardo Only)

Suggestion

You are strongly encouraged to use these environment variables instead of full paths when referring to data in your scripts and codes.
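
For instance (a minimal illustration; file and account names are placeholders):

cp input.dat $WORK/                          # portable: works on every cluster
cp input.dat /gpfs/work/<account_name>/      # fragile: hard-codes the physical path of the area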

Overview of Available Data Areas


Name, attributes, quota, backup policy, and typical usage of each area:

$HOME: local, permanent, user specific, backed up
    Quota: 50 GB. Backup: daily.
    Typical usage: critical data, not too large, that you want to be sure to preserve.

$WORK: local, permanent, project specific
    Quota: 1 TB. Backup: none.
    Typical usage: large data to be shared with the collaborators of your project.

$FAST: local, permanent, project specific (available only on Leonardo)
    Quota: 1 TB. Backup: none.
    Typical usage: faster I/O compared with the other areas.

$SCRATCH: local, temporary, user specific
    Quota: none (a temporary 20 TB quota may be imposed). Backup: none.
    Files are deleted after 40 days. On Marconi the same variable is named $CINECA_SCRATCH.
    Typical usage: large temporary data.

$TMPDIR: local, temporary, user specific
    Quota: none. Backup: none.
    The directory is removed at job completion.

$PUBLIC: permanent, user specific, shared (available only on Leonardo)
    Quota: 50 GB. Backup: none.
    Typical usage: data to be shared with other users, not necessarily participating in common projects.

$DRES: permanent, shared
    Quota: defined according to the needs. Backup: none.
    Typical usage: data to be maintained even beyond the end of the project and used on different CINECA hosts; data to be shared among different platforms.

* All the filesystems are based on Lustre.


Ethical Use of the SCRATCH Area 

Users are encouraged to respect the intended use of the various areas. Users are reminded that the SCRATCH area is not subject to quota restrictions in order to facilitate the production of data, even in large amounts. However, the SCRATCH area must not be used as a long-term storage area. Users are warned against using “touch” commands or similar methods to extend the retention of files beyond the 40-day limit. The use of such improper procedures is monitored, and offending users will be subject to various degrees of restriction, up to a ban.

Description of Data Areas


$HOME: permanent/backed up, user specific, local

$HOME is a local area where you are placed after the login procedure. It is where system and user applications store their dot-files and dot-directories (.nwchemrc, .ssh, ...) and where users keep initialization files specific to the system (.cshrc, .profile, ...). There is a $HOME area for each username on the machine.

This area is conceived to store programs and small personal data. It has a quota of 50 GB. Files are never deleted from this area. Moreover, they are protected by daily backups: if you delete or accidentally overwrite a file, you can ask our Help Desk (superc@cineca.it) to restore it. A maximum of 3 versions of each file is stored as a backup. The last version of a deleted file is kept for two months, then permanently removed from the backup archive. File retention is tied to the life of the username: data are preserved as long as the username remains active.


$WORK: permanent, project specific, local

$WORK is a scratch area for collaborative work within a given project. File retention is related to the life of the project: files in $WORK are kept for up to 6 months after the project ends and are then deleted. Please note that there is no backup in this area.

This area is conceived for hosting large working data files since it is characterized by the high bandwidth of a parallel file system. It behaves very well when I/O is performed accessing large blocks of data, while it is not well suited for frequent and small I/O operations. This is the main area for maintaining scratch files resulting from batch processing.

There is one $WORK area for each active project on the machine. The default quota is 1 TB per project, but extensions can be considered by the Help Desk (mailto: superc@cineca.it) if motivated. The owner of the main directory is the PI (Principal Investigator) of the project. All collaborators are allowed to read/write there. Collaborators are advised to create a personal directory in $WORK for storing their personal files. By default, the personal directory is protected (only the owner can read/write), but the protection can easily be modified, for example by allowing write permission to project collaborators through the chmod command. Opening a personal directory in this way does not affect the security of the rest of the project area.
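
As a minimal sketch (the directory name and permission mode are illustrative), a collaborator could create a personal directory and, if desired, open it to the project group:

mkdir $WORK/$USER
chmod 770 $WORK/$USER     # grant read/write access to project collaborators (same unix group)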

The chprj (change project) command makes it easier to manage the $WORK areas of different projects; see the "Pointing $WORK to a different project" section below for details.


Summary of $WORK

  • Created when a project is opened.
  • Each project has its own area.
  • All collaborators can write in it.
  • Data are preserved up to six months after the end of the project.
  • Each user has as many $WORK areas as active projects.
  • By default, files are private.
  • The user can change the permissions (chmod) and make files visible (R or R/W) to project collaborators.
  • Default quota of 1 TB.
  • No backup.

$FAST: permanent, project specific, local (LEONARDO ONLY)

$FAST is a scratch area for collaborative work within a given project. File retention is related to the life of the project: files in $FAST are kept for up to 6 months after the project ends and are then deleted. Please note that there is no backup in this area.

This area is conceived for hosting working data files whenever the I/O operations constitute the bottleneck for your applications. It behaves well both when I/O is performed accessing large blocks of data, and for frequent and small I/O operations. Due to the limited size of the area, the main space for maintaining the data resulting from batch processing is the corresponding $WORK area.

There is one $FAST area for each active project on the machine. The fixed quota is 1 TB per project and, due to the total dimension of the storage, extensions cannot be considered. The owner of the main directory is the PI (Principal Investigator) of the project. All collaborators are allowed to read/write there. Collaborators are advised to create a personal directory in $FAST for storing their personal files. By default, the personal directory is protected (only the owner can read/write), but the protection can easily be modified, for example by allowing write permission to project collaborators through the chmod command. Opening a personal directory in this way does not affect the security of the rest of the project area.

The chprj (change project) command makes it easier to manage the $FAST areas of different projects; see the "Pointing $WORK to a different project" section below for details.

$SCRATCH: temporary, user specific, local

This is a local temporary storage area conceived for temporary files from batch applications. There are important differences with respect to the $WORK area: it is user specific (not project specific), and by default file access is closed to everyone. If you need less restrictive protections, you can set them with the chmod command.

A periodic cleaning procedure may be applied to this area, with a normal retention time of 40 days: files not accessed for more than 40 days are deleted daily by an automatic procedure. In each user's home directory ($HOME), a file lists all the files deleted on a given day:

CLEAN_<yyyymmdd>.log
      <yyyymmdd> = date when the files were deleted
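
To get an idea of which files are approaching the limit, you can, for example, list the files in your scratch area that have not been accessed for more than 30 days (the 30-day threshold is just an illustration):

find $SCRATCH -type f -atime +30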

There is one $CINECA_SCRATCH area for each username on the machine.

$CINECA_SCRATCH does not have any disk quota. Please be aware that on the Galileo100 and Marconi100 clusters, to prevent a very dangerous filling condition, a 20 TB disk quota is temporarily imposed on all users when the global occupancy of the area reaches 88%; this disk quota is removed when the global occupancy returns to normal.

To verify if and how the cleaning procedure is active on a given cluster, check the message of the day (motd) shown at login.

On the Marconi cluster the environment variable $SCRATCH is named $CINECA_SCRATCH.


Summary of $CINECA_SCRATCH

  • Created when the username is granted access.
  • Each username has its own area (and only one).
  • A clean-up procedure is active.
  • Files not accessed for more than 40 days are deleted daily.
  • No backup.
  • By default files are private.
  • The user can change the permissions (chmod) and make files visible to others.
  • It is not possible to restrict access to the group (all usernames share the same main unix group).
  • No Quota.


$TMPDIR: temporary, user specific, local

Each compute node is equipped with a local area whose size differs depending on the cluster (please refer to the specific page of the cluster for more details).
When a job starts, a temporary area is defined on the storage local to each compute node. On MARCONI and GALILEO100:

TMPDIR=/scratch_local/slurm_job.$SLURM_JOB_ID

Unlike the other CINECA clusters, on LEONARDO the job's temporary area is managed by the SLURM job_container/tmpfs plugin, which provides an equivalent job-specific, private temporary file system space, with private instances of /tmp and /dev/shm in the job's user space:

TMPDIR=/tmp

visible via the command df -h /tmp. If multiple jobs share a node, each of them has its own private /tmp in the job's user space. The tmpfs areas are removed at the end of the job (and all data in them are lost).

Whatever the mechanism, the TMPDIR can be used exclusively by the job's owner. During your jobs, you can access the area through the (local) variable $TMPDIR. In your sbatch script, for example, you can move the input data of your simulations to $TMPDIR before the beginning of the run and also write your results there. This can further improve the I/O speed of your code.

However, the directory is removed at the job's end, so always remember to save the data stored in this area to a permanent directory in your sbatch script at the end of the run. Please note that the area is located on local disks, so it can be accessed only by the processes running on the specific node. For multinode jobs, if you need all the processes to access some data, please use the shared filesystems $HOME, $WORK, and $CINECA_SCRATCH.
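
A minimal sbatch sketch of this staging pattern (application name, file names, and resource requests are purely illustrative):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --account=<account_no>          # your project account

# stage the input data to the node-local temporary area
cp $WORK/input.dat $TMPDIR/

# run in $TMPDIR so that scratch I/O hits the local storage
cd $TMPDIR
$WORK/my_app input.dat > output.dat

# copy the results back to a permanent area before the job ends,
# since $TMPDIR is removed at job completion
cp $TMPDIR/output.dat $WORK/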

Unlike on the other CINECA clusters, thanks to the job_container/tmpfs plugin the local storage on LEONARDO is treated as a "resource" and can be explicitly requested, on the diskful nodes only (DCGP and serial nodes), via the sbatch directive or srun option "--gres=tmpfs:XX" (see the Disks and Filesystems section of LEONARDO's User Guide for the allowed maximum values). For the same reason, the requested amount of the gres/tmpfs resource contributes to the consumed budget, changing the number of accounted equivalent core hours; see the dedicated section on Accounting on CINECA clusters.
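
On LEONARDO such a request might look like the following sbatch directives (the partition name and the tmpfs amount are illustrative; check the LEONARDO User Guide for the allowed maximum values):

#SBATCH --partition=lrd_all_serial      # a diskful (serial) partition, name is illustrative
#SBATCH --gres=tmpfs:10g                # requested amount of node-local tmpfs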

$PUBLIC: user specific, permanent, shared (LEONARDO ONLY)

$PUBLIC is a shared area conceived for sharing data with other users, not necessarily participating in common projects. Files in $PUBLIC will be kept for up to 6 months after the project ends and then deleted. Please note that there is no backup in this area.

$DRES: permanent, shared (among platforms and projects)

This is a repository area for collaborative work among different projects and across platforms. You need to explicitly ask for this kind of resource: it does not come as part of a project (mailto: superc@cineca.it).

File retention is related to the life of the DRES itself: files in DRES are kept for up to 6 months after the DRES expires and are then deleted. At present, several types of DRES are available:

  • FS: normal filesystem access on high throughput disks, shared among all HPC platforms (mounted only on login nodes);
  • ARCH: magnetic tape archiving with a disk-like interface via LTFS;
  • REPO: smart repository based on iRODS.

This area is conceived for hosting data files to be used by several projects, in particular if you need to use them on different platforms. For example, you may need to post-process data produced on Marconi taking advantage of the visualization environment of Galileo, or you may require a place for experimental data to be processed by several related projects. This filesystem is mounted only on the login nodes and on the nodes of the "serial" partitions of all HPC clusters in Cineca (e.g. bdw_all_serial on Marconi, gll_all_serial on Galileo); please use batch jobs on the serial partitions for rsync transfers of large amounts of data, or use GridFTP clients. As a consequence, you have to move data from $DRES to the $CINECA_SCRATCH or $WORK area in order for them to be seen by the compute nodes.
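
A minimal sketch of such a transfer job (partition, account, and directory names are illustrative; use the serial partition of your cluster):

#!/bin/bash
#SBATCH --partition=bdw_all_serial      # e.g. the serial partition on Marconi
#SBATCH --ntasks=1
#SBATCH --time=04:00:00
#SBATCH --account=<account_no>

# copy a dataset from the DRES area to the project work area,
# so that the compute nodes can access it
rsync -av $DRES/<dres_name>/dataset/ $WORK/dataset/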

WARNING: DRES of type ARCH have a limit on the number of available inodes: 2000 inodes for each TB of quota. This means that no more than 2000 files can be stored in 1 TB of disk space. It is therefore recommended that you compress your files for storage purposes and that the stored files have an average size of about 500 MB. DRES of types FS and REPO do not have this limitation.
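
For example, many small files can be packed into a single archive before being copied to an ARCH DRES (file and directory names are illustrative):

tar -czf results_2024.tar.gz results_2024/
cp results_2024.tar.gz $DRES/<dres_name>/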

Files stored in DRES of type FS or ARCH are moved automatically to tape storage when specific conditions are met:

- for ARCH: files are older than 3 months and bigger than 50 MB
- for FS: files are older than 3 months and bigger than 100 MB

The data movement is transparent to the user: only the physical support changes, while the logical environment is not affected (this means that you can reach data stored on tape using the same path you used to store it in the first place).


Summary of $DRES

  • Created on request.
  • Not linked to a specific project.
  • All collaborators can write.
  • Data are accessible on all platforms but visible only on login nodes.
  • Compute nodes cannot access data in $DRES.
  • By default files are private.
  • The user can change the permissions (chmod) and make files visible (R or R/W) to collaborators.
  • Quota based on the needs. A limit of 2000 files per TB applies to the ARCH type.
  • Data are preserved up to six months after the expiry date.
  • No backup.

Backup Policies and Data Availability

  • The $HOME filesystem is guaranteed by daily backups. In particular, the daily backup procedure preserves a maximum of three different copies of the same file. Older versions of files are kept for 1 month. The last version of deleted files is kept for 2 months, then permanently removed from the backup archive. Different agreements about backup policies are possible. For more information contact the HPC support (superc@cineca.it).
  • Data, both backed up and non-backed up, are available for the entire duration of the project. After a project expires, users will still have full access to the data for an additional six months. Beyond this six-month period, data availability is not guaranteed. A scheme of data availability is reported in the figure below.

Important: users are responsible for backing up their own important data!

Monitoring the occupancy

The occupancy status of all areas accessible to a user, along with the storage quota limits, can be monitored with a simple command available on all HPC clusters. There are two commands: cindata and cinQuota (the latter only on Galileo100 and Leonardo). For both commands, the -h flag shows the help. In the following, examples of the cindata and cinQuota output produced for a DRES user are shown.


$ cindata

USER           AREADESCR                        AREAID                  FRESH    USED      QTA    USED%    aUSED    aQTA    aUSED%
myuser00  /gpfs/work/<AccountName>       galileo_work-Acc-name          9hou     114G      --      --%      14T      30T    48.8%
myuser00  /gpfs/scratch/                 galileo_scr                    9hou     149G      --      --%     341T     420T    81.2%
myuser00  /galileo/home                  galileo_hpc-home               9hou     5.7G      50G     11.4%    16T       --      --%
myuser00  /gss/gss_work/DRES_myAcc       work_OFFLINE-DRES_myAcc-FS     9hou     2.9G      --      --%      11T      15T    73.3%
myuser00  /gss/gss_work/DRES_myAcc       work_ONLINE-DRES_myAcc-FS      9hou     1.2T      --      --%     2.8T       4T    70.0%

Interpreting the storage status can be complex. Here's a breakdown:

  • OFFLINE area: this represents DRES data that have been moved to tape after three months of storage (see the DRES description above).
  • ONLINE area: this represents DRES data that are still in the filesystem (FS) or ARCH area.

The total storage quota assigned to your DRES is indicated by the aQTA parameter in the OFFLINE line.

When no data have been moved to tape yet, the ONLINE aQTA value is the same as the OFFLINE one. As files begin to be moved to tape, the ONLINE aQTA decreases, while the aUSED parameter in the OFFLINE line increases accordingly. This indicates that you have less space available for storing new data online, since some of the used space has been moved to tape.

Similarly, deleting offline data decreases the aUSED parameter in the OFFLINE line and increases the aQTA parameter in the ONLINE line by the same amount.

Remember this formula:

TOTAL DRES STORAGE = aQTA-OFF = aQTA-ON + aUSED-OFF
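
For example, in the cindata output shown above the OFFLINE line reports aQTA = 15T and aUSED = 11T, while the ONLINE line reports aQTA = 4T: the total DRES storage is 15T = 4T + 11T.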


The second tool for monitoring disk occupancy is cinQuota. A typical output of the command contains the following information:

$ cinQuota

------------------------------------------------------------------------------------------
     Filesystem                                  used          quota         grace         files
------------------------------------------------------------------------------------------
 /g100/home/userexternal/myuser00               22.66G            50G         -            194295
 /g100_scratch/userexternal/myuser00             1.955T             0k         -             41139
 /g100_work/<AccountName>                        366.3G             1T         -            548665
------------------------------------------------------------------------------------------
  

Note

The tools are available in the cintools module, which is automatically loaded in your environment. However, the module can be unloaded like any other module (see the Modules section).

File permissions

As explained above, $WORK and $DRES are environment variables automatically set in the user environment.

  • $WORK variable points to a directory (fileset) specific to one of the user's projects:

/gpfs/work/<account_name>  
  • $DRES variable points to the space where all of the DRES are defined:

/gss/gss_work/

In order to access a specific DRES, use the following path:

$DRES/<dres_name>

The owner of the root directory is the "Principal Investigator" (PI) of the project or the "owner" of the DRES; the group corresponds to the name of the project or the name of the DRES. Default permissions are:

own: rwx
group: rwx
other: - 

In this way, all project collaborators, sharing the same project group, can read/write into the project/DRES fileset, whereas other users cannot.

Users are advised to create a personal subdirectory under $WORK and $DRES. By default, files in the subdirectory are private, but the owner can easily share them with the other collaborators by opening the subdirectory:

chmod 777 mydir    # collaborators (same group) can read and write in the directory
chmod 755 mydir    # collaborators can read and enter the directory, but not write

Since the $WORK/$DRES fileset is closed to non-collaborators, data sharing is active only among the project collaborators.

Pointing $WORK to a different project: the chprj command

The user can modify the project pointed to by the variable $WORK using the "change project" command.

To list all your accounts (both active and completed) and the default project:

chprj -l

To set $WORK to point to a different <account_no> project:

chprj -d <account_no>

More details are in the help page of the command:

chprj -h
chprj --help

On LEONARDO only: The command applies to the $FAST variable as well.

Info: A comprehensive discussion on how to manage your data can be found in a specific document.

Endianness

Endianness is the attribute of a system that indicates whether multi-byte values are stored with the most significant byte first (big-endian) or the least significant byte first (little-endian). At present, all clusters in Cineca are little-endian.
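
You can verify this, for example, on a login node with:

lscpu | grep "Byte Order"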
