...
(updated November 2022)
Galileo100
G16 (rel C.01, rel C.02)
Benchmark performed in production time. One replica per job.
...
(DFT / 3-21G) + Force on Valinomycin
#p rb3lyp/3-21g force test scf=novaracc (mem=Default)
Parallelism (number cores) | Cpu-time (s) (C.01) | Cpu-time (s) (C.02) |
---|---|---|
1 | 2890 | 2600 |
4 | 770 | 721 |
8 | 423 | 385 |
16 | 228 | 215 |
32 | 150 | 127 |
36 | 138 | 111 |
46 | 106 | 97 |
48 | 105 | 92 |
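The scaling in the table above can be summarized as speedup and parallel efficiency relative to the 1-core run. The following is a minimal sketch, not part of the original benchmark: the names `CORES`, `TIMES`, and `scaling` are hypothetical, and it assumes the reported times can be read as time-to-solution (the column is labeled Cpu-time, but the values decrease with core count).

```python
# Times copied from the Galileo100 table above (seconds, G16 C.01 and C.02).
# Assumption: the values behave like time-to-solution for each core count.
CORES = [1, 4, 8, 16, 32, 36, 46, 48]
TIMES = {
    "C.01": [2890, 770, 423, 228, 150, 138, 106, 105],
    "C.02": [2600, 721, 385, 215, 127, 111, 97, 92],
}


def scaling(cores, times):
    """Return (cores, speedup, efficiency) tuples relative to the first entry."""
    base = times[0]
    return [(n, base / t, base / (t * n)) for n, t in zip(cores, times)]


if __name__ == "__main__":
    for release, times in TIMES.items():
        print(release)
        for n, speedup, efficiency in scaling(CORES, times):
            print(f"  {n:2d} cores: speedup {speedup:5.2f}, efficiency {efficiency:4.0%}")
```

Efficiency here is speedup divided by the number of cores, so a value near 100% would mean near-ideal scaling.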
Marconi A3 (intel SKL)
node exclusive,
48 core/node,
182.000 MB memory/node (usable)
...
Parallelism | Cpu-time | Elapsed-time |
---|---|---|
serial small_M | 0:54:35 | 0:54:48 |
serial large_M | 0:54:40 | 0:54:53 |
4 procs | 1:01:29 | 0:16:03 |
8 procs | 1:09:41 | 0:09:28 |
16 procs | 1:29:15 | 0:06:24 |
32 procs | 2:14:25 | 0:05:00 |
46 procs | 3:16:03 | 0:05:06 |
48 procs | 3:25:18 | 0:05:08 |
Ref (4 cores) | 1:02:26 | 0:16:30 |
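For tables that report both Cpu-time and Elapsed-time in H:MM:SS form, two derived figures are useful: the elapsed-time speedup over the serial run, and the growth in total CPU time as cores are added. The sketch below is illustrative only; `to_seconds`, `elapsed_speedup`, and `cpu_overhead` are hypothetical helper names, and the "serial small_M" row is assumed as the baseline.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# Values copied from the Marconi A3 table above; key 1 is the "serial small_M" row.
CPU = {1: "0:54:35", 4: "1:01:29", 8: "1:09:41", 16: "1:29:15",
       32: "2:14:25", 46: "3:16:03", 48: "3:25:18"}
ELAPSED = {1: "0:54:48", 4: "0:16:03", 8: "0:09:28", 16: "0:06:24",
           32: "0:05:00", 46: "0:05:06", 48: "0:05:08"}


def elapsed_speedup(procs):
    """Serial elapsed time divided by elapsed time on `procs` processes."""
    return to_seconds(ELAPSED[1]) / to_seconds(ELAPSED[procs])


def cpu_overhead(procs):
    """Total CPU time on `procs` processes relative to the serial CPU time."""
    return to_seconds(CPU[procs]) / to_seconds(CPU[1])


if __name__ == "__main__":
    for procs in sorted(ELAPSED):
        print(f"{procs:2d} procs: speedup {elapsed_speedup(procs):5.2f}, "
              f"CPU overhead x{cpu_overhead(procs):4.2f}")
```

With these numbers the elapsed time stops improving between 32 and 48 processes while the total CPU time keeps growing.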
Marconi A2 (KNL)
Node exclusive,
68 core/node,
86.000 MB memory/node (usable)
Test0397
(DFT / 3-21G) + Force on Valinomycin
#p rb3lyp/3-21g force test scf=novaracc (mem=86.000MB)
Parallelism | Cpu-time | Elapsed-time |
---|---|---|
serial | 0:57:47 | 0:57:57 |
4 procs | 4:29:28 | 1:07:26 |
8 procs | 4:32:54 | 0:34:11 |
16 procs | 1:07:46 | 0:04:17 |
32 procs | 5:27:57 | 0:10:20 |
68 procs | 7:38:03 | 0:08:35 |
Ref (4 cores) | 1:03:00 | 0:16:00 |
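A quick consistency check on a row is the ratio of Cpu-time to Elapsed-time, which approximates the average number of cores kept busy during the run. This is a rough sketch under that assumption; `to_seconds`, `KNL_ROWS`, and `busy_cores` are hypothetical names introduced for illustration.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# (procs, Cpu-time, Elapsed-time) copied from the Marconi A2 (KNL) Test0397 table above.
KNL_ROWS = [
    (1, "0:57:47", "0:57:57"),
    (4, "4:29:28", "1:07:26"),
    (8, "4:32:54", "0:34:11"),
    (16, "1:07:46", "0:04:17"),
    (32, "5:27:57", "0:10:20"),
    (68, "7:38:03", "0:08:35"),
]


def busy_cores(cpu, elapsed):
    """Average number of busy cores, estimated as CPU time over elapsed time."""
    return to_seconds(cpu) / to_seconds(elapsed)


if __name__ == "__main__":
    for procs, cpu, elapsed in KNL_ROWS:
        print(f"{procs:2d} procs requested: about {busy_cores(cpu, elapsed):5.2f} cores busy")
```

A ratio close to the requested process count means the cores were busy for most of the run; it says nothing about whether that work was efficient.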
Test0590
(Local Spin Density Approx)
#p lsda/gen/auto opt=(modred,expert) test (mem=86000MB)
Parallelism | Cpu-time | Elapsed-time |
---|---|---|
serial small_M | | |
serial large_M | | |
4 procs | 5:01:57 | 1:17:50 |
8 procs | 6:03:55 | 0:48:13 |
16 procs | | |
32 procs | 10:38:32 | 0:22:09 |
60 procs | 17:35:43 | 0:19:45 |
68 procs | | |
Ref (4 cores) | 1:02:26 | 0:16:30 |
GALILEO (BDW)
nodes shared,
36 core/node,
118.000 MB memory/node (usable)
Test0397
(DFT / 3-21G) + Force on Valinomycin
#p rb3lyp/3-21g force test scf=novaracc (mem=86.000MB)
Parallelism | Cpu-time | Elapsed-time |
---|---|---|
serial | 0:43:06 | 0:43:10 |
4 procs | 0:57:31 | 0:14:24 |
8 procs | 1:04:17 | 0:08:05 |
16 procs | 1:05:42 | 0:04:08 |
32 procs | 1:16:39 | 0:02:25 |
36 procs | 1:18:58 | 0:02:14 |
Ref (4 cores) | 1:03:00 | 0:16:00 |
Test0590
(Local Spin Density Approx)
#p lsda/gen/auto opt=(modred,expert) test (mem=200MB)
Parallelism | Cpu-time | Elapsed-time |
---|---|---|
serial small_M | 0:40:39 | 0:41:16 |
serial large_M | 0:39:50 | 0:40:10 |
4 procs | 0:57:57 | 0:15:05 |
8 procs | 1:02:16 | 0:08:17 |
16 procs | 1:28:54 | 0:06:10 |
32 procs | 2:19:38 | 0:04:59 |
36 procs | 2:29:46 | 0:04:44 |
Ref (4 cores) | 1:02:26 | 0:16:30 |
Old systems on Test0397
CPU/Elapsed on different systems
System | serial | 4 procs | 8 procs | 16 procs | 32 procs | 36 procs | 46 procs | 48 procs | 68 procs |
---|---|---|---|---|---|---|---|---|---|
Ref | | 1:03:00 | | | | | | | |
MARCONI | 0:58:24 | 0:59:30 | 1:02:41 | 1:07:53 | 1:20:24 | 1:30:31 | 1:30:31 | 1:33:49 | |
MARCONI | 0:57:47 | 4:29:28 | 4:32:54 | 1:07:46 | 5:27:57 | | | | 7:38:03 |
GALILEO | 0:43:06 | 0:57:31 | 1:04:17 | 1:05:42 | 1:16:39 | 1:18:58 | | | |
BCX | 1:54:00 | 2:32:00 |
SP5 | 1:39:00 | 1:39:00 | 1:45:00 |
SP6 | 1:09:00 | 1:11:00 | 1:16:00 |
PLX | 0:52:00 | 1:02:00 |
EURORA | 0:32:00 | 0:38:00 | 0:41:00 | 0:46:00 |