Introduction
globus-url-copy is a scriptable command-line tool for multi-protocol data movement, including GridFTP. It is mainly intended for Linux/Unix users. You can use globus-url-copy in these cases:
- User Local PC <==> CINECA HPC Cluster
- User Local PC <==> iRODS repository
- CINECA HPC Cluster A <==> CINECA HPC Cluster B
- CINECA HPC Cluster <==> iRODS repository
The following steps show how to transfer your data to and from a CINECA cluster using globus-url-copy.
Get X.509 personal certificate
Create your proxy credential
- "CASE 1)" if you requested a certificate from your own certification authority
- or "CASE 2)" (further down this page) if you requested a certificate from CA-CINECA.
Convert your certificate to PEM format. If it is a ".p12" or ".pfx" file, convert it by typing:
bash$ openssl pkcs12 -clcerts -nokeys -in <cert.p12 or cert.pfx> -out usercert.pem
Enter Import Password: <password used for backup of your .p12 certificate>
MAC verified OK
bash$ openssl pkcs12 -nocerts -in <cert.p12 or cert.pfx> -out userkey.pem
Enter Import Password: <password used for backup of your .p12 certificate>
MAC verified OK
Enter PEM pass phrase: <password to encrypt your private key>
Verifying - Enter PEM pass phrase: <password to encrypt your private key>
Set the correct permissions on the files just created:
bash$ chmod 644 usercert.pem
bash$ chmod 400 userkey.pem
Extract the Distinguished Name (DN) from your certificate:
bash$ openssl x509 -in usercert.pem -noout -subject | sed 's/subject= //'
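The "subject=" prefix is not printed the same way by every OpenSSL release: older ones print "subject= /C=...", while some newer ones print "subject=C = ..." with no space after the equals sign, in which case a sed pattern that expects the space leaves the prefix in place. A minimal sketch of a more tolerant filter follows. The function name strip_subject_prefix is hypothetical, and the resulting DN should still be checked against the syntax requested by userdb.

```shell
#!/bin/bash
# Hypothetical helper (not part of OpenSSL or Globus): remove the leading
# "subject=" label, with or without trailing spaces, from each input line.
strip_subject_prefix() {
    sed 's/^subject= *//'
}

# Usage, assuming usercert.pem is in the current directory:
#   openssl x509 -in usercert.pem -noout -subject | strip_subject_prefix
```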
To use globus-url-copy on the MARCONI, PICO and GALILEO clusters, the extracted DN must be added to your profile in our userdb (https://userdb.hpc.cineca.it/user), under the "personal data" section, in the "X.509 certificate" field, following the syntax specified there. To use globus-url-copy with the iRODS repository, you must have been added as PI or collaborator to a DRES of type REPO.
Create the directory ~/.globus and copy usercert.pem and userkey.pem into it.
bash$ mkdir ~/.globus
bash$ cp <some location>/usercert.pem ~/.globus
bash$ cp <some location>/userkey.pem ~/.globus
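The copy step can be wrapped in a small function that also re-applies the permissions set earlier, so that it is safe to run more than once. This is only a sketch: the name install_credentials and its optional second argument are hypothetical, not part of the Globus tools.

```shell
#!/bin/bash
# Hypothetical helper: copy usercert.pem and userkey.pem from a source
# directory into a Globus directory (default: ~/.globus) and re-apply
# the permissions used earlier on this page.
install_credentials() {
    local src_dir="$1"
    local dest_dir="${2:-$HOME/.globus}"

    # Stop early if either file is missing.
    [ -f "$src_dir/usercert.pem" ] || { echo "missing usercert.pem" >&2; return 1; }
    [ -f "$src_dir/userkey.pem" ]  || { echo "missing userkey.pem" >&2; return 1; }

    mkdir -p "$dest_dir" || return 1   # -p: no error if it already exists
    # -f: replace an existing read-only copy of the key if there is one
    cp -f "$src_dir/usercert.pem" "$src_dir/userkey.pem" "$dest_dir/" || return 1
    chmod 644 "$dest_dir/usercert.pem" && chmod 400 "$dest_dir/userkey.pem"
}

# Usage:
#   install_credentials "<some location>"
```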
Install the myproxy package if the login machine is not a CINECA HPC cluster. Installation packages are available through YUM and APT repositories for several platforms.
After installing the myproxy package, type:
bash$ mkdir ~/.globus/certificates && cd ~/.globus/certificates
bash$ wget https://winnetou.surfsara.nl/prace/certs/globuscerts.tar.gz && tar -xzvf globuscerts.tar.gz
case 1.1) if both your X.509 certificate and the source data are on the same machine, create your proxy certificate (starting from your X.509 certificate) with the command:
grid-proxy-init
case 1.2) if the X.509 certificate and the source data are on two different machines, log in to the machine where the X.509 certificate is located and create the proxy with the command:
grid-proxy-init
Then store the proxy on the grid.hpc.cineca.it myproxy server by typing the command:
myproxy-init -l <username> -s grid.hpc.cineca.it
Finally, log in to the machine where the source data are located and retrieve the proxy certificate with the command:
myproxy-logon -l <username> -s grid.hpc.cineca.it
When you have finished using your proxy credential, destroy it by typing:
myproxy-destroy -s grid.hpc.cineca.it -l <username>
grid-proxy-destroy
NB: The proxy expires 12 hours after its creation ("init"). After that time you have to create a new proxy before starting another transfer. To increase the proxy lifetime, use the "-t <hours>" parameter of the myproxy-init command.
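Because the proxy has a limited lifetime, it can help to check how much time is left before starting a long transfer. The sketch below is only the arithmetic part of such a check. The function name proxy_valid_for is hypothetical, and the usage comment assumes that grid-proxy-info is installed and that its -timeleft option prints the remaining lifetime in seconds.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) if a proxy with
# $1 seconds of lifetime left is still valid for at least $2 hours.
proxy_valid_for() {
    local seconds_left="$1"
    local hours_needed="$2"
    [ "$seconds_left" -ge $(( hours_needed * 3600 )) ]
}

# Usage sketch (assumption: grid-proxy-info -timeleft prints seconds):
#   if ! proxy_valid_for "$(grid-proxy-info -timeleft)" 2; then
#       echo "Less than 2 hours left: create a new proxy first" >&2
#   fi
```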
CASE 2) if you have requested a certificate from CA-CINECA
Log in to the machine where your source data are located.
case 2.1) If the login machine is a CINECA HPC cluster, retrieve your proxy credential with the command:
myproxy-logon -s grid.hpc.cineca.it -l <username>
where <username> and the password are the same ones you use on the CINECA HPC machines.
case 2.2) If the login machine is not a CINECA HPC cluster, install the myproxy package. Installation packages are available through YUM and APT repositories for several platforms.
After installing the myproxy package, type:
bash$ mkdir ~/.globus/certificates && cd ~/.globus/certificates
bash$ wget https://winnetou.surfsara.nl/prace/certs/globuscerts.tar.gz && tar -xzvf globuscerts.tar.gz
Download the files 467aa3eb.0 and 467aa3eb.signing_policy and compute the hash of the certificate. Then rename the two downloaded files to <hash>.0 and <hash>.signing_policy and copy them into ~/.globus/certificates.
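The rename-and-copy step can be sketched as a small function. The name install_ca_files is hypothetical. The usage comment assumes that the hash in question is the subject hash printed by the OpenSSL x509 command with the -hash option; that command is an assumption and is not taken from this page.

```shell
#!/bin/bash
# Hypothetical helper: copy the two downloaded CA files into a
# certificates directory under the names <hash>.0 and
# <hash>.signing_policy.
#   $1 = hash string, $2 = directory holding the downloaded files,
#   $3 = destination (default: ~/.globus/certificates)
install_ca_files() {
    local hash="$1"
    local src_dir="$2"
    local dest_dir="${3:-$HOME/.globus/certificates}"

    [ -n "$hash" ] || { echo "empty hash" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    cp "$src_dir/467aa3eb.0" "$dest_dir/$hash.0" || return 1
    cp "$src_dir/467aa3eb.signing_policy" "$dest_dir/$hash.signing_policy"
}

# Usage sketch (assumption: the hash is the OpenSSL subject hash):
#   hash=$(openssl x509 -noout -hash -in 467aa3eb.0)
#   install_ca_files "$hash" .
```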
Finally, retrieve your proxy credential
myproxy-logon -s grid.hpc.cineca.it -l <username>
where <username> and the password are the same ones you use on the CINECA HPC machines.
When you have finished using your proxy credential, destroy it by typing:
myproxy-destroy -s grid.hpc.cineca.it -l <username>
NB: The proxy expires 7 days after its creation ("init"). After that time you have to create a new proxy before starting another transfer.
Install the standard client
Once you have a valid proxy certificate, the globus-url-copy client must be installed on the machine holding your source data. It is already available on all the CINECA HPC clusters (MARCONI, PICO and GALILEO). On your local PC, if you are in a UNIX environment, you can either a) install the system package (RPM and Debian packages are available) or b) build it from source. The former is the most suitable option for most users; follow the instructions available at http://www.ige-project.eu/downloads/software/releases/downloads. See also http://www.ige-project.eu/downloads/documents/guide/component-installation-guide for more details.
Use the standard client
Now that you have a valid proxy on the machine with the source data, you can start to transfer your data.
To transfer a file from a CINECA HPC cluster to your local PC:
$ globus-url-copy gsiftp://[username@]<ENDPOINT-CINECA HPC Cluster>/<remote_path/to/yourfile> file:///home/user/<local_path/to/yourfile>
To transfer a file from your local PC to a CINECA HPC cluster:
$ globus-url-copy /path/to/your/local/file gsiftp://[username@]<ENDPOINT-CINECA HPC Cluster>/remote/path/
where the possible values of <ENDPOINT-CINECA HPC Cluster> are:
iRODS repository --> gftp.repo.cineca.it:2811
MARCONI machine --> gftp.marconi.cineca.it:2811
MARCONI machine for PRACE users --> gftp-prace.marconi.cineca.it:2811
PICO machine --> gftp.pico.cineca.it:2811
PICO machine for PRACE users --> gftp-prace.pico.cineca.it:2811
GALILEO machine --> gftp.galileo.cineca.it:2811
MARCONI-100 machine --> gftp.m100.cineca.it:2811
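To avoid retyping host names and ports, the endpoint list above can be wrapped in a lookup function. This is a sketch only: the function names cineca_endpoint and gsiftp_url and the short names (repo, marconi, m100, and so on) are hypothetical and chosen for this example.

```shell
#!/bin/bash
# Hypothetical helper: map a short name to the GridFTP endpoint
# listed above. The short names are chosen for this sketch only.
cineca_endpoint() {
    case "$1" in
        repo)          echo "gftp.repo.cineca.it:2811" ;;
        marconi)       echo "gftp.marconi.cineca.it:2811" ;;
        marconi-prace) echo "gftp-prace.marconi.cineca.it:2811" ;;
        pico)          echo "gftp.pico.cineca.it:2811" ;;
        pico-prace)    echo "gftp-prace.pico.cineca.it:2811" ;;
        galileo)       echo "gftp.galileo.cineca.it:2811" ;;
        m100)          echo "gftp.m100.cineca.it:2811" ;;
        *)             echo "unknown endpoint: $1" >&2; return 1 ;;
    esac
}

# Build a gsiftp:// URL from a short name and an absolute remote path.
gsiftp_url() {
    local endpoint
    endpoint=$(cineca_endpoint "$1") || return 1
    echo "gsiftp://$endpoint$2"
}

# Usage:
#   globus-url-copy file:///home/user/data.tar "$(gsiftp_url galileo /remote/path/)"
```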
Examples
1- to recursively synchronize a directory and its subdirectories to MARCONI (similar to rsync):
$ globus-url-copy -cd -r -sync /path/to/your/dir/ gsiftp://[username@]gftp.marconi.cineca.it:2811/~/remote/dir/
2- to move a large amount of data from MARCONI to PICO, using parallel data streams (the -p option):
$ globus-url-copy -p 4 gsiftp://[username@]gftp.marconi.cineca.it:2811/~/path/to/file gsiftp://gftp.pico.cineca.it:2811/~/path/
3- to list the files in your directory on the iRODS repository:
$ globus-url-copy -list gsiftp://gftp.repo.cineca.it:2811/CINECA01/home/your-remote-dir/
GridFTP transfer via batch script
For a GridFTP transfer that takes longer than 10 CPU minutes, we suggest submitting a batch job to the serial queue. The required steps are listed below.
Create a proxy certificate in a location accessible from all the nodes of the cluster (e.g. your $HOME directory):
$ grid-proxy-init -out $HOME/proxy.cert
Below is an example of a job script:
#!/bin/bash
#SBATCH --err=slurm_%J.err
#SBATCH --out=slurm_%J.out
#SBATCH --time=04:00:00 #max time 4h
#SBATCH --nodes=1 --ntasks-per-node=1 --cpus-per-task=1
#SBATCH --partition=bdw_all_serial #on GALILEO the partition is gll_all_serial, on DAVIDE the partition is dvd_all_serial
globus-url-copy -cred $HOME/proxy.cert <file in local position> gsiftp://<gridftp endpoint>/remote/path/
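When the same kind of transfer is submitted repeatedly, or on clusters with different partition names, the job script can be generated from a few parameters. The sketch below prints a script equivalent to the example above, spelling out the Slurm options as --error and --output. The function name write_transfer_job and the file name transfer.job are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a Slurm job script for one transfer.
#   $1 = partition, $2 = proxy file, $3 = source URL or path,
#   $4 = destination URL
write_transfer_job() {
    cat <<EOF
#!/bin/bash
#SBATCH --error=slurm_%J.err
#SBATCH --output=slurm_%J.out
#SBATCH --time=04:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=$1

globus-url-copy -cred "$2" "$3" "$4"
EOF
}

# Usage sketch:
#   write_transfer_job bdw_all_serial "$HOME/proxy.cert" \
#       file:///path/to/file gsiftp://gftp.marconi.cineca.it:2811/remote/path/ > transfer.job
#   sbatch transfer.job
```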