Attention

DDT is no longer available on the HPC clusters at CINECA.

 

This document describes how to run a (remote) DDT session on CINECA's PLX machine.

Allinea DDT, the Distributed Debugging Tool, is a debugger for scalar, multi-threaded and large-scale parallel applications.

Allinea DDT is an intuitive, scalable, graphical debugger. DDT can be used as a single-process or a multi-process program (MPI) debugger. Both modes of DDT are capable of debugging multiple threads, including OpenMP codes.

DDT provides all the standard debugging features (stack trace, breakpoints, watches, view variables, threads etc.) for every thread running as part of your program, or for every process - even if these processes are distributed across a cluster using an MPI implementation.

C, C++, Fortran and Fortran 90/95/2003 are all supported by DDT, along with a large number of platforms, compilers and all known MPI libraries. You may also debug GPU programs by enabling CUDA support.


1) Compile your source code with NO optimization flags

For example, take the C program from the attached files (test.c) and copy it to a directory on the PLX machine (for example in the $CINECA_SCRATCH filesystem).

Load the needed modules:

module load intel
module load openmpi

Since it is a C program using the MPI library, you have to compile it with the "mpicc" compiler:

mpicc -g -O0 -o test.x test.c -i_dynamic

The compilation options used here should be considered MANDATORY for successful debugging sessions with DDT:

-g         : Generates debug information for debugging tools
-O0        : Disables compile-time optimizations
-i_dynamic : Suppresses "warning: feupdateenv is not implemented and will always fail"
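
As a small sanity check before a debugging session, a compile command line can be scanned for the two debugging flags. The sketch below is only an illustration: check_debug_flags is a hypothetical helper, not part of DDT or of the PLX environment, and it only inspects the text of the command without running the compiler.

```shell
#!/bin/bash
# Hypothetical helper (not part of DDT): report whether a compile command
# line contains the -g and -O0 flags used in this tutorial.
check_debug_flags() {
    missing=""
    # Pad with spaces so that each flag is matched as a whole word.
    case " $* " in *" -g "*) ;; *) missing="$missing -g" ;; esac
    case " $* " in *" -O0 "*) ;; *) missing="$missing -O0" ;; esac
    if [ -z "$missing" ]; then
        echo "ok"
    else
        echo "missing:$missing"
        return 1
    fi
}

# The compile line used in this tutorial passes the check.
check_debug_flags mpicc -g -O0 -o test.x test.c -i_dynamic
```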

2) Configure DDT session (only the first time)

Since DDT has a graphical user interface, at this point you need to connect to the PLX server with an "Xterm" session.
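
Graphical programs started on a remote machine need the DISPLAY environment variable to be set in the shell. The following is a minimal sketch of a check that can be run before starting DDT; have_display is a hypothetical helper name, and "ssh -X" is mentioned only as one common way to obtain X11 forwarding, which may differ from the access method recommended for PLX.

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeeds only if DISPLAY is set and non-empty.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

if have_display; then
    echo "DISPLAY is set to $DISPLAY"
else
    echo "DISPLAY is not set: reconnect with X11 forwarding enabled (for example ssh -X)"
fi
```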

To configure your DDT session, proceed as follows:

  • Take the DDT batch script template from the attached files (tmpl) and copy it to a directory on the PLX machine (for example in the $CINECA_SCRATCH filesystem).
  • Open the template and set the account option to your account number:
#PBS -A <my_account>
  • Load the DDT module:
module load ddt
  • Run the DDT debugger:
ddt &
  • In the "Configuration Wizard", select "Create a new configuration file";
  • Select your preferred MPI implementation ("generic" in this case);
  • Enter your ssh password;
  • In the "Job Scheduling" window select "Configure submission of jobs through a job scheduler"
  • Fill in the "Job Submission Settings" window as follows:
"Submission template file": $CINECA_SCRATCH/tmpl
"Submit command": qsub
"Regexp for job id": .+\.
"Cancel command": qdel JOB_ID_TAG
"Display command": qstat
  • Skip the "Site wide configuration" step.
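
To see what the "Regexp for job id" value selects, the same expression can be tried on a sample line with grep. This is only a sketch: the sample string is a made-up reply in the usual "number.server" form, not real PLX output, and match_job_id is a hypothetical helper. How DDT uses the matched text is described in its user guide and is not reproduced here.

```shell
#!/bin/bash
# Hypothetical helper: print the part of a line matched by the expression .+\.
match_job_id() {
    printf '%s\n' "$1" | grep -oE '.+\.'
}

# Made-up sample of a scheduler reply (number, dot, server name).
match_job_id "123456.node001"
```

Because ".+" is greedy, the match extends up to the last dot in the line, so a reply containing several dots yields a longer match.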

You are now ready to run an interactive or batch DDT session.

3a) Run an interactive DDT job - Serial Job

  • Load the DDT module:
module load ddt
  • Load system libraries required by your program:
module load intel openmpi
  • Run the DDT debugger:
ddt &

  • In the "DDT welcome" window, click "Manually launch a program";
  • In the next window, select the needed options (the default choice is fine for this tutorial) and click the "Listen" button;
  • In the shell, launch the executable to be debugged through the DDT client:
ddt-client test.x

and the DDT session will start.

For more information about DDT see http://allinea.com/downloads/userguide.pdf

3b) Run a batch DDT job - Serial and Parallel Job

  • Load the DDT module:
module load ddt
  • Run the DDT debugger:
ddt &

  • In the "DDT welcome" window, click "Run and debug a program";
  • In the "Run" window, set your application name and its arguments (test.x and no arguments for this tutorial), the wall clock limit, the queue, the number of processes, etc. (the default values are fine for this tutorial);

  • Click "submit" and the DDT session will start.

For more information about DDT see http://allinea.com/downloads/userguide.pdf

Test_ddt
// This is an MPI program in C for testing Allinea DDT

#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
  int rank, size;
  float send, recv;
  int left, right;
  MPI_Status status;


  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("I am %d and we are in %d \n", rank, size);

  left = rank-1;
  right = rank+1;
  if(rank == 0) left = size-1;
  if(rank == size-1) right = 0;

  send = (float)rank;
  recv = -1;
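  // Combined exchange: a blocking MPI_Send issued before MPI_Recv on every rank
  // relies on internal buffering and is not guaranteed to complete.
  MPI_Sendrecv(&send, 1, MPI_FLOAT, right, 1,
               &recv, 1, MPI_FLOAT, left , 1, MPI_COMM_WORLD, &status);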

  printf("I am %d and I recv    %f \n", rank, recv);


  MPI_Finalize();

  exit(EXIT_SUCCESS);

}
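
When stepping through the test program in the debugger, it helps to know in advance which values each rank should hold. The sketch below repeats the neighbour computation of the C program in the shell and prints the value each rank is expected to receive, namely the rank of its left neighbour. ring_expected is a hypothetical helper written for this page; it does not run MPI.

```shell
#!/bin/bash
# Hypothetical helper: for a ring of "size" processes, print the neighbours of
# each rank and the value it should receive (the rank of its left neighbour).
ring_expected() {
    size=$1
    rank=0
    while [ "$rank" -lt "$size" ]; do
        # Same wrap-around as in the C program: rank 0 receives from the last rank.
        left=$(( (rank - 1 + size) % size ))
        right=$(( (rank + 1) % size ))
        echo "rank $rank: left=$left right=$right expects recv=${left}.000000"
        rank=$(( rank + 1 ))
    done
}

# Four processes, as requested by mpiprocs=4 in the batch script.
ring_expected 4
```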

 

Batch Script
#!/bin/bash
#PBS -o test_ddt.out
#PBS -e test_ddt.err
#PBS -l walltime=0:20:00
#PBS -l select=1:mpiprocs=4
#PBS -q debug
#PBS -A <my_account>
# 

cd $PBS_O_WORKDIR

module load autoload
module load openmpi/1.3.3--intel--11.1--binary 

mpirun test_ddt.x

DDT batch script template (tmpl)
#!/bin/bash

# Name: PBS
#
# submit: qsub
# display: qstat
# job regexp: .+\. 
# cancel: qdel JOB_ID_TAG
# use num_nodes: yes
# 
# WALL_CLOCK_LIMIT_TAG: {type=text,label="Wall Clock Limit",default="00:30:00",mask="09:09:09"}
# QUEUE_TAG: {type=text,label="Queue",default=debug}

# DDT will generate a submission script from this by replacing these tags:
#        TAG NAME         |         DESCRIPTION           |        EXAMPLE
# ---------------------------------------------------------------------------
# PROGRAM_TAG             | target path and filename      | /users/ned/a.out
# PROGRAM_ARGUMENTS_TAG   | arguments to target program   | -myarg myval
# NUM_PROCS_TAG           | total number of processes     | 16
# NUM_NODES_TAG           | number of compute nodes       | 8
# PROCS_PER_NODE_TAG      | processes per node            | 2
# NUM_THREADS_TAG         | OpenMP threads per proc       | 4
# DDT_DEBUGGER_ARGUMENTS_TAG | arguments to be passed to ddt-debugger
# MPIRUN_TAG              | name of mpirun executable     | mpirun
# AUTO_MPI_ARGUMENTS_TAG  | mpirun arguments              | -np 4
# EXTRA_MPI_ARGUMENTS_TAG | extra mpirun arguments        | -x FAST=1
#
# Note that NUM_NODES_TAG and PROCS_PER_NODE_TAG are only valid if DDT is
# set to 'use NUM_NODES' in the queue options. If not, they will be replaced
# with the number of processes and 1 respectively.

#PBS -l walltime=WALL_CLOCK_LIMIT_TAG,select=NUM_NODES_TAG:ncpus=PROCS_PER_NODE_TAG
#PBS -q QUEUE_TAG
#PBS -o PROGRAM_TAG-ddt.output
#PBS -e PROGRAM_TAG-ddt.error
#PBS -A <my_account>

cd $PBS_O_WORKDIR

module load autoload
module load openmpi/1.3.3--intel--11.1--binary

if [ NUM_PROCS_TAG = 1 ]
then
    DDTPATH_TAG/bin/ddt-client PROGRAM_TAG PROGRAM_ARGUMENTS_TAG
else
    MPIRUN_TAG AUTO_MPI_ARGUMENTS_TAG EXTRA_MPI_ARGUMENTS_TAG DDTPATH_TAG/bin/ddt-debugger DDT_DEBUGGER_ARGUMENTS_TAG PROGRAM_TAG PROGRAM_ARGUMENTS_TAG
fi
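
The tag replacement described in the template comments can be imitated with sed to preview the PBS directives that would result. This is only a rough sketch: DDT performs the real substitution itself, fill_tags is a hypothetical helper, and the values passed to it are illustrative choices, not settings taken from PLX.

```shell
#!/bin/bash
# Hypothetical helper imitating the tag substitution on standard input.
# Arguments: wall clock limit, number of nodes, processes per node, queue.
fill_tags() {
    sed -e "s/WALL_CLOCK_LIMIT_TAG/$1/g" \
        -e "s/NUM_NODES_TAG/$2/g" \
        -e "s/PROCS_PER_NODE_TAG/$3/g" \
        -e "s/QUEUE_TAG/$4/g"
}

# Preview the two directives of the template that contain these tags.
fill_tags 00:30:00 1 4 debug <<'EOF'
#PBS -l walltime=WALL_CLOCK_LIMIT_TAG,select=NUM_NODES_TAG:ncpus=PROCS_PER_NODE_TAG
#PBS -q QUEUE_TAG
EOF
```

According to the note in the template, when 'use NUM_NODES' is not enabled the two node-related tags receive the number of processes and 1 respectively, which corresponds here to calling fill_tags with 4 and 1 as the second and third arguments.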

 
