Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

> mpirun <options> nsys profile -o ${PWD}/output_%q{OMPI_COMM_WORLD_RANK} -f true --stats=true --cuda-memory-usage=true <your_code> <input> <output>

On the single node you can also run the profiler as "nsys profile mpirun", but keep in mind that with this syntax nsys will put everything in a single report.

Unfortunately nsys usually generates several files in /tmp dir of the compute node even if a TMPDIR environment variable is set. These files may be big causing the filling of the /tmp folder and causing , as a consequence, the crash of the compute node and the failing failure of the job.
In order to avoid such a problem we strongly suggest to include in your sbatch script the following lines around your mpirun call as a workaround:

...

This will place the temporary outputs of the nsys code in your TMPDIR folder that by default is /scratch_local/slurm_job.$SLURM_JOB_ID where you have 1 TB of free space.
This workaround may cause conflicts between multiple jobs running this profiler on a compute node at the same time, so we strongly suggest also to reques request the compute node exclusively:

...