You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 11 Next »

Package: Quantum-ESPRESSO

G100

version 7.1

CNT10POR8 : R&G scaling on the CPU

  • Performance analysis for a a Carbon nanotube functionalized with two porphyrine molecules, about 1500 atoms, 8000 bands, 1 k-point
  • The average time per iteration is reported as a function of the number of nodes.


Table1: The performance of the Pure MPI

N° nodes

Time (s)

8248
16134
3275
6448



Graphic 1: the QE performance (simulation time in s) is reported vs. the increasing number of nodes

Leonardo

version 7.2

CNT10POR8 : R&G scaling on the GPUs

  • Performance analysis for a a Carbon nanotube functionalized with two porphyrine molecules, about 1500 atoms, 8000 bands, 1 k-point.
  • The average time per iteration is reported as a function of the number of nodes.


Table2: The performance of the MPI (1 task per GPU) + GPU (4 per node) + OpenMP (8 threads per task)

N° nodes

Time (s)

821.34
1614.06
2012.18
2411.60
 






Graphic 2: the QE(v7.2) performance (simulation time in s) is reported vs. the increasing number of nodes

QuantumESPRESSO PWSCF v7.2

GPUs strongly improve the time to solution, but scaling with R&G has little efficiency beyond the minimum number of GPUs to be used for memory constraints.   












© Copyright 2

  • No labels