Cineca provides an easy tool for establishing a graphical session with our systems: RCM. Any software that comes with a graphical user interface (GUI) can be used within an RCM session. Totalview is no exception, and can be used together with RCM to debug a parallel code. Unlike other GUIs that can be run on RCM, however, Totalview must be run directly on the nodes that execute the parallel code. The following sections detail how to establish a Totalview debugging session through RCM from within a batch job.
Please refer to this page for instructions on how to use RCM; in most cases it is as simple as: 1) download the tool, 2) launch the executable.
Once you have established a connection through RCM with one of our systems, FERMI or Galileo, please follow the instructions below.
On FERMI, once connected you should have a desktop session open. Open a terminal via "Applications -> System Tools -> Terminal". A terminal pops up, and you can use it as you normally would over an ssh connection. The steps below go through the operations required to launch a Totalview job.
1) Get the DISPLAY number
In the terminal just opened, type "echo $DISPLAY" and take note of the value printed by the command. For example:
$ echo $DISPLAY
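Before copying the value into a job script, it can help to check that DISPLAY is actually set and has the usual X11 shape. The following is a minimal sketch; the function name is_valid_display is hypothetical (it is not part of RCM or Totalview), and the accepted forms are assumed to be ":N", ":N.S" and "host:N.S".

```shell
#!/bin/bash
# Hypothetical helper (not part of RCM or Totalview): succeed only if the
# argument looks like an X11 display, e.g. ":1", ":1.0" or "host:12.0".
is_valid_display() {
    [[ "$1" =~ ^[^:]*:[0-9]+(\.[0-9]+)?$ ]]
}

# Check the DISPLAY of the current terminal and report the result.
if is_valid_display "${DISPLAY:-}"; then
    echo "DISPLAY looks usable: $DISPLAY"
else
    echo "DISPLAY is empty or malformed: '${DISPLAY:-}'" >&2
fi
```

If the check fails inside an RCM terminal, the session itself is probably not set up correctly, and submitting the job would not open the Totalview window.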
2) Prepare the job
An example of a job script that launches a debugging session with Totalview is:
# @ bg_size = 64
# @ wall_clock_limit = 00:20:00
# @ job_name = poisson
# @ job_type = bluegene
# @ notification = never
# @ output = $(job_name).$(jobid).out
# @ error = $(output)
# @ account_no = your_account_here
# @ queue
# Set DISPLAY so that Totalview opens on the session started by RCM.
# DISPLAY has to be set to "fen02:value", where value
# can be found by typing "echo $DISPLAY" in an RCM terminal.
export DISPLAY=fen02:value   # replace "value" with the one noted in step 1
module load totalview
totalview runjob -a --ranks-per-node 1 --np 64 --verbose 0 : ./poisson.exe
The DISPLAY setting in the above example tells the Totalview user interface to open on the current VNC session (opened automatically by RCM). See step 1 above for how to get the correct DISPLAY number.
3) Submit the job
Now you can submit the job just prepared. Once it starts running, the Totalview user interface will pop up and you will be able to debug your code.
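As a rough sketch of the submission step: the "# @" directives in the job above are LoadLeveler keywords, so the job would be submitted with llsubmit and monitored with llq. The file name totalview_job.cmd is an assumption made here for illustration; use whatever name the script was saved under.

```shell
#!/bin/bash
# Assumption: the job script was saved as "totalview_job.cmd" (hypothetical name).
JOB_FILE="${1:-totalview_job.cmd}"

if command -v llsubmit >/dev/null 2>&1; then
    llsubmit "$JOB_FILE"   # submit the job to LoadLeveler
    llq -u "$USER"         # list your jobs and their current state
else
    # LoadLeveler commands are only available on the cluster itself.
    echo "llsubmit not found: run this from a terminal on the cluster" >&2
fi
```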
On GALILEO, once connected with RCM, open a terminal (start -> terminal), then follow the instructions below.
1) Prepare the job (job.sh script)
#PBS -l walltime=00:30:00
#PBS -l select=1:ncpus=4:mpiprocs=4:mem=15gb
#PBS -N totalview
#PBS -o job.out
#PBS -e job.err
#### account number (type saldo -b)
#PBS -A your_account_here
module load autoload <openmpi|intelmpi> # select the MPI implementation your program was built with
module load totalview
totalview mpirun -a -n 4 poisson.exe
2) Submit the job
Submit the job and pass the variable DISPLAY to the execution nodes.
qsub -v DISPLAY=`hostname`$DISPLAY job.sh
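The command above assumes that $DISPLAY starts with a colon (for example ":1.0"). If it already contains a host part, simply prefixing the hostname would produce an invalid value. The sketch below builds the value more defensively; the function name qualified_display is hypothetical, and the command is only printed so that it can be checked before being run.

```shell
#!/bin/bash
# Hypothetical helper: build a "host:display" string that is valid on another node.
qualified_display() {
    local host="$1" display="$2"
    # Keep only the part after the first ':', so "localhost:1.0" also works.
    echo "${host}:${display#*:}"
}

# Print the qsub command instead of running it, so it can be inspected first.
echo "qsub -v DISPLAY=$(qualified_display "$(hostname)" "${DISPLAY:-:0}") job.sh"
```

For a DISPLAY of ":1.0" this produces the same value as the backtick form above.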
PS: In a terminal opened inside RCM, the shortcut to paste text copied elsewhere is "Ctrl+Shift+Insert"