(Updated November 2022)
Galileo100
G16 (relC.01, relC.02)
Benchmark performed in production time. One replica per job.
The picture shows how the elapsed time (in seconds) varies with the number of cores (on one node).
Test0397
(DFT / 3-21G) + Force on Valinomycin
#p rb3lyp/3-21g force test scf=novaracc (mem=Default)
Parallelism (number of cores) | Elapsed time (s) (C.01) | Elapsed time (s) (C.02) |
---|---|---|
1 | 2890 | 2600 |
4 | 770 | 721 |
8 | 423 | 385 |
16 | 228 | 215 |
32 | 150 | 127 |
36 | 138 | 111 |
46 | 106 | 97 |
48 | 105 | 92 |
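The elapsed times above can be turned into speedup and parallel efficiency figures. The following is a minimal sketch, assuming the single-core run is the baseline; the function name `scaling` and the variable names are illustrative and not part of the original benchmark.

```python
# Elapsed times in seconds, copied from the Galileo100 table above.
cores = [1, 4, 8, 16, 32, 36, 46, 48]
elapsed_c01 = [2890, 770, 423, 228, 150, 138, 106, 105]
elapsed_c02 = [2600, 721, 385, 215, 127, 111, 97, 92]


def scaling(cores, elapsed):
    """Return (cores, speedup, efficiency) tuples relative to the first entry.

    Assumption: the first entry (1 core) is taken as the reference run.
    """
    t_ref = elapsed[0]
    p_ref = cores[0]
    rows = []
    for p, t in zip(cores, elapsed):
        speedup = t_ref / t                 # how many times faster than the reference
        efficiency = speedup / (p / p_ref)  # fraction of ideal linear scaling
        rows.append((p, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for label, data in (("C.01", elapsed_c01), ("C.02", elapsed_c02)):
        print(f"G16 rel{label}")
        for p, s, e in scaling(cores, data):
            print(f"  {p:>2} cores: speedup {s:5.1f}, efficiency {e:5.1%}")
```

With these numbers, the 48-core run reaches a speedup of roughly 28 for both releases, which corresponds to a parallel efficiency a little below 60%.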
Marconi A3 (intel SKL)
node exclusive,
48 cores/node,
182,000 MB memory/node (usable)
Test0397
(DFT / 3-21G) + Force on Valinomycin
#p rb3lyp/3-21g force test scf=novaracc (mem=86.000MB)
Parallelism | CPU time (h:mm:ss) | Elapsed time (h:mm:ss) |
---|---|---|
serial | 0:58:24 | 0:58:35 |
4 procs | 0:59:30 | 0:14:57 |
8 procs | 1:02:41 | 0:07:53 |
16 procs | 1:07:53 | 0:04:17 |
32 procs | 1:20:24 | 0:02:33 |
46 procs | 1:30:31 | 0:02:30 |
48 procs | 1:33:49 | 0:02:30 |
Ref (4 cores) | 1:3:xx | 0:16:00 |
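Because this table reports both CPU time and elapsed time, it also shows how much extra CPU work the parallel runs cost. The sketch below is one possible way to compute it, assuming the serial run is the baseline; `to_seconds` and `summarize` are hypothetical helper names, and the "Ref (4 cores)" row is left out because its CPU time is incomplete.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# (processes, CPU time, elapsed time) from the Marconi A3 Test0397 table.
runs = [
    (1, "0:58:24", "0:58:35"),
    (4, "0:59:30", "0:14:57"),
    (8, "1:02:41", "0:07:53"),
    (16, "1:07:53", "0:04:17"),
    (32, "1:20:24", "0:02:33"),
    (46, "1:30:31", "0:02:30"),
    (48, "1:33:49", "0:02:30"),
]


def summarize(runs):
    """Return (procs, speedup, cpu_overhead) relative to the first (serial) run."""
    serial_cpu = to_seconds(runs[0][1])
    serial_elapsed = to_seconds(runs[0][2])
    out = []
    for procs, cpu, elapsed in runs:
        speedup = serial_elapsed / to_seconds(elapsed)  # wall-clock gain
        cpu_overhead = to_seconds(cpu) / serial_cpu     # total CPU work vs. serial
        out.append((procs, speedup, cpu_overhead))
    return out


if __name__ == "__main__":
    for procs, speedup, overhead in summarize(runs):
        print(f"{procs:>2} procs: speedup {speedup:5.1f}, CPU time x{overhead:4.2f}")
```

On this data the elapsed time stops improving at 0:02:30 from 46 processes onward, while the total CPU time keeps growing to about 1.6 times the serial value at 48 processes.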
Test0590
(Local Spin Density Approximation)
#p lsda/gen/auto opt=(modred,expert) test (mem=200MB)
Parallelism | CPU time (h:mm:ss) | Elapsed time (h:mm:ss) |
---|---|---|
serial small_M | 0:54:35 | 0:54:48 |
serial large_M | 0:54:40 | 0:54:53 |
4 procs | 1:01:29 | 0:16:03 |
8 procs | 1:09:41 | 0:09:28 |
16 procs | 1:29:15 | 0:06:24 |
32 procs | 2:14:25 | 0:05:00 |
46 procs | 3:16:03 | 0:05:06 |
48 procs | 3:25:18 | 0:05:08 |
Ref (4 cores) | 1:02:26 | 0:16:30 |
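One way to quantify why Test0590 stops scaling beyond 32 processes is the Karp-Flatt metric, which estimates the serial fraction from a measured speedup S on p processes as e = (1/S - 1/p) / (1 - 1/p). The sketch below applies it to the elapsed times above; this analysis is not part of the original benchmark, the "serial small_M" run is assumed as the baseline, and `karp_flatt` is an illustrative name.

```python
def karp_flatt(speedup, procs):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    if procs < 2:
        raise ValueError("the metric is defined only for 2 or more processes")
    return (1.0 / speedup - 1.0 / procs) / (1.0 - 1.0 / procs)


# Elapsed times in seconds, converted from the Test0590 table above.
# Assumption: "serial small_M" (0:54:48) is used as the serial baseline.
serial_elapsed = 3288
parallel_elapsed = {4: 963, 8: 568, 16: 384, 32: 300, 46: 306, 48: 308}

if __name__ == "__main__":
    for procs, elapsed in parallel_elapsed.items():
        speedup = serial_elapsed / elapsed
        fraction = karp_flatt(speedup, procs)
        print(f"{procs:>2} procs: speedup {speedup:5.2f}, serial fraction {fraction:.3f}")
```

For these runs the estimated serial fraction stays between roughly 0.05 and 0.075 and rises slightly at the highest process counts, which is consistent with the elapsed time flattening out at about five minutes.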