This page describes how to use rsync to transfer files to, from, and between CINECA HPC machines. The rsync parameters shown here are tuned for the CINECA internal network; please refer to the rsync manual (man rsync) if you are looking for different optimizations.

The main advantage of rsync over scp is that an interrupted transfer (due to connection problems or job time limits) can be resumed from the point where it stopped simply by relaunching the command, without starting again from zero.
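For example, an interrupted transfer can be resumed by repeating the identical command; this is only a sketch with placeholder paths and hostname (the -P option used throughout this page is what keeps partially transferred files):

# first attempt: the transfer is interrupted (connection drop, job time limit, ...)
rsync -PravzHS </data_path_from/dir> <username>@login.<hostname>.cineca.it:<data_path_to>
# relaunching the same command resumes the copy: files already on the destination
# are skipped and partially transferred files are reused instead of starting from zero
rsync -PravzHS </data_path_from/dir> <username>@login.<hostname>.cineca.it:<data_path_to>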

There are two different ways to use rsync that will be described below in detail:

  • via batch job or interactive session: this is the recommended choice when transferring files between two machines with a public IP (CINECA HPC machines, but also machines outside CINECA). You will use a dedicated queue with a time limit of 4 hours.
  • via command line on login nodes or on your PC: the only possibility when transferring files to/from machines without a public IP (e.g. your personal PC). Regardless of whether you run the command on our clusters or on your personal PC, it will create an rsync process on a login node of our cluster with a time limit of 10 minutes; as a consequence, after about 10 minutes the transfer will be killed. This choice is quick and convenient for small files, but impractical for files larger than about 10 GB. This applies to all our clusters except Marconi100, for which we have set up a different solution (see below).

For very large data sets (>~ 500 GB), we strongly suggest using Globus Online via the GridFTP protocol.

Rsync via batch job or interactive session

This is the recommended solution if you would like to move or retrieve data from other CINECA HPC machines or other places with a public IP.
By using the serial queue you have up to 4 hours per job to complete your data transfer; in addition, jobs on the serial queue do not consume your budget.

Interactive session

When running the rsync command you will be asked for a password unless you have exchanged a public key between the two clusters.
You can open an interactive session:

srun -N1 -n1 --cpus-per-task=1 -A <account_name> -p <serial_queue_name> --time=04:00:00 --pty /bin/bash

then run your rsync command as shown in the dedicated section and enter the password when requested.
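Once the interactive shell has opened on the serial node, a typical session could look like the following sketch (paths and hostname are placeholders):

cd <data_path_to>
rsync -PravzHS <username>@login.<hostname>.cineca.it:</data_path_from/dir> .
# type the password when prompted; when the transfer has finished, close the session
exit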

Single job

If you have exchanged a public key between the two clusters (see How to connect by a public key) you don't have to type the password.

WARNING: it is not recommended to leave private keys on public clusters for long periods: in case of a security breach they can be stolen and used to move on to other clusters. We therefore recommend using rsync between clusters via an interactive session and resorting to this option only when really needed. We strongly recommend removing the private key once the data transfer is completed.
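As a minimal sketch of such a key exchange with standard OpenSSH tools (see the dedicated page "How to connect by a public key" for the supported procedure):

# on the cluster where the job will run: generate a key pair
# (an empty passphrase is needed for unattended use in a batch job,
# which is exactly why the warning above applies)
ssh-keygen -t rsa -b 4096
# install the public key on the destination cluster
ssh-copy-id <username>@login.<hostname>.cineca.it
# ... run the transfer job(s) ...
# once the data transfer is completed, remove the private key just created
rm ~/.ssh/id_rsa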

In this case you can submit a job script like the example below, where we move data to one of our HPC machines:

################ serial queue 1 cpu ##########

#!/bin/bash
#SBATCH --out=job.out
#SBATCH --time=04:00:00
#SBATCH --nodes=1 --tasks-per-node=1 --cpus-per-task=1 --mem=4096
#SBATCH --account=<account name> ##you can find the name of the account by using "saldo -b" command
#SBATCH --partition=<serial queue name> ##you can find the name of the dedicated queue by using "sinfo|grep serial" command
#
cd <data_path_to>
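# rsync flags: -P keeps partially transferred files and shows progress,
# -r recursive, -a archive mode (permissions, timestamps, ...), -v verbose,
# -z compress data during the transfer, -H preserve hard links, -S handle sparse files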
rsync  -PravzHS </data_path_from/dir> <username>@login.<hostname>.cineca.it:<data_path_to>            

##########################################


You can ask for more than a single cpu (--cpus-per-task) and execute one rsync command per cpu on different data.
Example:

################ serial queue 2 cpu ##########

#!/bin/bash
#SBATCH --out=%j.job.out
#SBATCH --time=04:00:00
#SBATCH --nodes=1 --tasks-per-node=1 --cpus-per-task=2 --mem=4096
#SBATCH --account=<account name> ##you can find the name of the account by using "saldo -b" command
#SBATCH --partition=<serial queue name> ##you can find the name of the dedicated queue by using "sinfo|grep serial" command
#
cd <data_path_to>
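# run the two rsync commands in background (&) so they proceed in parallel,
# one per requested cpu; "wait" blocks until both transfers have finished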
rsync -PravzHS  <username>@login.<hostname>.cineca.it:</data_path_from/dir1> <data_path_to1>  &
rsync -PravzHS  <username>@login.<hostname>.cineca.it:</data_path_from/dir2> <data_path_to2>  &
wait
########################################

Chaining multiple jobs

If your data copy requires more than 4 hours you can run chained jobs, taking advantage of the fact that an rsync transfer can be interrupted and restarted. Each chained job has a time limit of up to 4 hours and resumes the copy from the file where the previous job was interrupted.


$ sbatch job1.cmd
Submitted batch job 100
$ sbatch --dependency=afternotok:100 job2.cmd
Submitted batch job 102

where job1.cmd and job2.cmd are job scripts like the ones shown above.
The available options for -d or --dependency are:
afterany:job_id[:jobid...], afternotok:job_id[:jobid...], afterok:job_id[:jobid...], etc.
See the sbatch man page for additional details.
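If you expect to need more than two chained jobs, the submission can also be scripted. The following is only a sketch (the script name job.cmd and the chain length are arbitrary examples):

#!/bin/bash
# submit a chain of N identical rsync jobs; each job starts only if the previous
# one did not complete successfully (e.g. it was killed at the 4-hour time limit)
N=5
jobid=$(sbatch --parsable job.cmd)
for i in $(seq 2 $N); do
    jobid=$(sbatch --parsable --dependency=afternotok:$jobid job.cmd)
done
# jobs whose dependency is never satisfied (because a previous job completed
# successfully) remain pending and can be removed with scancel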

Rsync via command line

You can launch rsync via command line on login nodes in the following way:

------ CINECA /space1/ <-> CINECA /space2/ ----------------------
rsync -PravzHS </data_path_from/file> <data_path_to>

------ CINECA /HPC machine 1/ <-> CINECA /HPC machine 2/ ----------------------
rsync -PravzHS </data_path_from/file> username@login.<hostname>.cineca.it:<data_path_to>

------ CINECA -> LOCAL/HPC machine ----------------------

rsync -PravzHS username@login.<hostname>.cineca.it:</data_path_from/dir> <data_path_to>

------ LOCAL/HPC machine -> CINECA ----------------------
rsync -PravzHS <data_path_from/dir> username@login.<hostname>.cineca.it:<data_path_to>

We remind you that, on CINECA clusters, the maximum CPU time available via the command line is 10 minutes. If your rsync process is killed after this time (e.g. for big files > 10 GB) and your transfer has not been completed, re-execute the rsync command until the data transfer is complete.
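If you prefer not to relaunch the command by hand, a simple shell loop run on your own machine (or on the login node) can repeat it until rsync exits successfully. This is only a sketch with placeholder paths:

# repeat the transfer until rsync returns exit code 0 (transfer completed),
# waiting a few seconds between attempts
until rsync -PravzHS username@login.<hostname>.cineca.it:</data_path_from/dir> <data_path_to>; do
    echo "transfer interrupted, retrying..."
    sleep 10
done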

Dedicated node for Marconi100

For Marconi100, in order to avoid the 10-minute limit, we have set up a dedicated node accessible through a dedicated alias.
In this case you can use the commands:

rsync -PravzHS </data_path_from/file> <your_username>@data.m100.cineca.it:<data_path_to>
rsync -PravzHS <your_username>@data.m100.cineca.it:<data_path_from/file> </data_path_to> 

