...

If the environment variable is not set, every task will write to the same gmon.out file.
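As an illustration, assuming the variable in question is glibc's GMON_OUT_PREFIX (our assumption, since it is the variable gprof-instrumented binaries use to name their output), you can give each task its own profile file by exporting a prefix and forwarding it to the ranks (-x forwards the variable with OpenMPI's mpirun); each task then writes <prefix>.<pid> instead of gmon.out:

> export GMON_OUT_PREFIX=${PWD}/gmon.out
> mpirun -x GMON_OUT_PREFIX <options> <your_code>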

GPU profilers (Nvidia Nsight Systems)

Nvidia Nsight Systems is a system-wide performance analysis tool designed to visualize an application's algorithms, help you identify the largest opportunities for optimization, and tune to scale efficiently across any number or size of CPUs and GPUs, from large servers to the smallest SoC.
General information on how to use it is available in the dedicated Nvidia User Guide pages.

On Marconi100 only the Command Line Interface (CLI) is available, since the GUI does not support Power9 nodes. Our suggestion is to run the CLI inside your job script in order to generate the qdrep files; you can then download the qdrep files to your local PC and visualize them with the Nsight Systems GUI available on your workstation.
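For example (the username, hostname and paths below are placeholders, and we assume the GUI executable is nsys-ui, as in recent Nsight Systems releases), a report can be copied to your workstation and opened with the GUI:

> scp <user>@login.m100.cineca.it:<job_dir>/output_0.qdrep .
> nsys-ui output_0.qdrep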

The standard usage for an MPI job running on GPUs is:

> mpirun <options> nsys profile -o ${PWD}/output_%q{OMPI_COMM_WORLD_RANK} -f true --stats=true --cuda-memory-usage=true <your_code> <input> <output>
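For reference, our reading of the options used above (consult nsys profile --help for the authoritative list):

  • -o ${PWD}/output_%q{OMPI_COMM_WORLD_RANK}: one report per rank; nsys expands %q{VAR} to the value of that environment variable
  • -f true: overwrite any existing report with the same name
  • --stats=true: print summary statistics at the end of the run
  • --cuda-memory-usage=true: track CUDA memory allocations in the report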

Unfortunately, nsys usually generates several files in the /tmp directory of the compute node even if a TMPDIR environment variable is set. These files may be large, filling the /tmp folder, which can crash the compute node and make the job fail.
To avoid this problem, we strongly suggest including the following lines in your sbatch script, around your mpirun call, as a workaround:

> rm -rf /tmp/nvidia
> ln -s $TMPDIR /tmp/nvidia
> mpirun ... nsys profile .....
> rm -rf /tmp/nvidia

This places the temporary outputs of nsys in your TMPDIR folder, which by default is /scratch_local/slurm_job.$SLURM_JOB_ID, where you have 1 TB of free space.
This workaround may cause conflicts between multiple jobs running this profiler on the same compute node at the same time, so we also strongly suggest requesting the compute node exclusively (a complete job-script sketch is shown below):

#SBATCH --exclusive
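Putting the pieces together, a minimal job script could look like the following sketch. The module name, account, partition and resource requests are placeholders to be adapted to your environment:

#!/bin/bash
#SBATCH --job-name=nsys_profile
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:4
#SBATCH --time=00:30:00
#SBATCH --partition=<partition>
#SBATCH --account=<account>
#SBATCH --exclusive

module load <nsight_systems_module>   # placeholder: load the module providing nsys

# workaround: redirect the nsys temporary files to TMPDIR (see above)
rm -rf /tmp/nvidia
ln -s $TMPDIR /tmp/nvidia

mpirun nsys profile -o ${PWD}/output_%q{OMPI_COMM_WORLD_RANK} -f true --stats=true --cuda-memory-usage=true <your_code> <input> <output>

# clean up the symlink at the end of the job
rm -rf /tmp/nvidia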


MPI environment

We offer two options for the MPI environment on Marconi100:

...