Model: Dual-Socket Dell PowerEdge
Architecture: Linux Infiniband Cluster
Nodes: 636 (+10 login nodes)
Processors: 2xCPU x86 Intel Xeon Platinum 8276-8276L (24c, 2.4GHz)
Cores: 48 cores/node
Accelerators: 2xGPU nVidia V100 PCIe3 with 32GB RAM on 36 Viz nodes
RAM: 384GB (+ 3.0TB Optane on 180 fat nodes)
Internal Network: Mellanox Infiniband 100GbE
Peak performance single node: 3.53 TFlop/s
Peak performance - total: about 2 PFlop/s



In March 2021, Galileo was turned off to make room for the new, more powerful Galileo100 infrastructure.
The new infrastructure is co-funded by the European ICEI (Interactive Computing e-Infrastructure) project and was engineered by DELL.

In November 2022, two additional racks, for a total of 82 "thin nodes", were added to the cluster.


System Architecture


Compute Nodes:

  • 636 computing nodes, each with 2 x CPU Intel CascadeLake 8260 (24 cores each, 2.4 GHz) and 384GB RAM, subdivided into:
      • 422 standard nodes ("thin nodes") with 480 GB SSD
      • 180 data processing nodes ("fat nodes") with 2TB SSD and 3TB Intel Optane
      • 34 GPU nodes (visualization, "viz") with 2x NVIDIA GPU V100, 100 Gb/s Infiniband interconnection and 2TB SSD
  • 77 computing servers OpenStack for cloud computing (ADA CLOUD), 2x CPU 8260 Intel CascadeLake, 24 cores, 2.4 GHz, 768 GB RAM, with 100 Gb/s Ethernet interconnection.
  • 20 PB of active storage accessible from both cloud and HPC nodes.
  • 1 PB Ceph storage for Cloud (full NVMe/SSD)
  • 720 TB fast storage (IME DDN solution)
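
As a quick consistency check on the figures above, the following minimal sketch adds up the three node types and derives the total core count and an aggregate CPU peak. It assumes the single-node peak of 3.53 TFlop/s quoted in the system summary applies to every node and ignores the GPUs; the names `NODE_COUNTS` and `cluster_totals` are illustrative and not part of any CINECA tool.

```python
# Node counts as listed above (thin, fat and viz nodes); names are illustrative.
NODE_COUNTS = {"thin": 422, "fat": 180, "viz": 34}
CORES_PER_NODE = 2 * 24        # two 24-core CPUs per node
PEAK_TFLOPS_PER_NODE = 3.53    # assumption: the quoted single-node peak holds for all nodes


def cluster_totals(node_counts=NODE_COUNTS):
    """Return (total nodes, total cores, aggregate CPU peak in PFlop/s)."""
    nodes = sum(node_counts.values())
    cores = nodes * CORES_PER_NODE
    peak_pflops = nodes * PEAK_TFLOPS_PER_NODE / 1000.0
    return nodes, cores, peak_pflops


nodes, cores, peak = cluster_totals()
print(f"{nodes} nodes, {cores} cores, roughly {peak:.1f} PFlop/s CPU peak")
```

The result is in line with the "about 2 PFlop/s" total quoted for the system.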

Login and Service nodes:

10 login nodes and 5 service nodes. All the nodes are interconnected through an Infiniband network, with OPA v10.6, capable of a maximum bandwidth of 100Gbit/s between each pair of nodes.

Accounting

For more information about accounting, please consult our dedicated section.

Budget Linearization policy

On GALILEO100 a linearization policy for the usage of project budgets has been defined and implemented. For each account, a monthly quota is defined as:

monthTotal = (total_budget / total_no_of_months)

Starting from the first day of each month, the collaborators of any account are allowed to use the quota at full priority. As the budget is consumed, the jobs submitted from the account gradually lose priority, until the monthly budget (monthTotal) is fully consumed. From that point on, their jobs are still considered for execution, but with a lower priority than the jobs from accounts that still have some monthly quota left.
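
The quota formula can be illustrated with a small sketch. It only computes monthTotal and how much of it has been used; how the scheduler actually lowers the priority as the quota is consumed is not specified here, so the sketch does not model it. The function names, the `used_this_month` parameter and the example figures are hypothetical.

```python
def month_total(total_budget: float, total_no_of_months: int) -> float:
    """Monthly quota: monthTotal = total_budget / total_no_of_months."""
    if total_no_of_months <= 0:
        raise ValueError("total_no_of_months must be positive")
    return total_budget / total_no_of_months


def quota_status(total_budget: float, total_no_of_months: int,
                 used_this_month: float) -> dict:
    """Summarize how much of the monthly quota an account has consumed."""
    quota = month_total(total_budget, total_no_of_months)
    return {
        "monthTotal": quota,
        "fraction_used": used_this_month / quota,
        "remaining": max(quota - used_this_month, 0.0),
        # Once the quota is used up, jobs still run but at lower priority.
        "quota_exhausted": used_this_month >= quota,
    }


# Hypothetical project: 120000 core-hours over 12 months, 2500 used so far this month.
status = quota_status(120_000, 12, 2_500)
print(status)
```

In this hypothetical case the monthly quota is 10000 core-hours, of which a quarter has been consumed, so the account still has quota left for the month.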

This policy is similar to those already applied by other major HPC centers in Europe and worldwide. The goal is to improve response time by giving users the opportunity to use the CPU hours assigned to their project in proportion to the project's actual size (total amount of core-hours).

