...

In January 2020 the A2 (KNL) partition was shut down and replaced with MARCONI100, a new partition with GPU accelerators. The new cluster is described in a separate document (UG3.1.1).


System A1 (Broadwell) - out of production since September 26th, 2018


Model: Lenovo NeXtScale

Racks: 10
Nodes: 1512 (then reduced to 720)
Processors: 2 x 18-cores Intel Xeon E5-2697 v4 (Broadwell) at 2.30 GHz
Cores: 36 cores/node, 25,920 cores in total
RAM: 128 GB/node, 3.5 GB/core
Peak Performance single node: 1.3 TFlop/s
Peak Performance: about 2 PFlop/s


...

If a graphical session is desired, we recommend using the RCM (Remote Connection Manager) tool. For additional information, visit the Remote Visualization section of our User Guide.

...

and checking the "compilers" section. In general, the available compilers are:

  • INTEL

...

  • GNU
  • PGI

The compilation is very simple. After loading the appropriate module, for example:

INTEL:

> module load intel

GNU (gcc, g77, g95):

> module load gnu

PGI - Portland Group (pgf77, pgf90, pgf95, pghpf, pgcc, pgCC):

> module load profile/advanced
> module load pgi

compile the code by running the compiler command, for example for a Fortran file with the intel module:

> ifort [flags] source_file

After loading the appropriate module, use the "man" command to get the complete list of the flags supported by the compiler, for example:

> module load intel
> man ifort

Some flags are common to all these compilers; others are compiler-specific. The most common are reported later for each compiler.

...

  For example

If you want to use a specific library or a particular include file, you have to give their paths using the following options:

-I/path_include_files specify the path of the include files
-L/path_lib_files -l<xxx> specify a library lib<xxx>.a in /path_lib_files

       2. If you want to debug your code you have to turn off optimization and turn on run-time checks: these flags are described in the following section.

If you want to compile your code for normal production you have to turn on optimization by choosing a higher optimization level:

-O2 or -O3 Higher optimization levels

Other flags are available for specific compilers and are reported later.
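
The flags above can be combined on a single command line. The following is a minimal sketch, not a site-provided script: the paths and the library name are the placeholders used above, source_file.f90 and executable are hypothetical names, and "-O0 -g" is assumed here as a generic way to turn off optimization and add debug symbols (the compiler-specific run-time checking flags are not included).

```shell
#!/bin/sh
# Sketch only: every path and file name below is a placeholder.
FC=ifort                        # compiler provided by the loaded module
INC_DIR=/path_include_files     # directory holding the include files
LIB_DIR=/path_lib_files         # directory holding libxxx.a
LIB_NAME=xxx                    # -lxxx links libxxx.a

# Print the compile command for a given build type ("debug" or anything else).
build_cmd() {
    case "$1" in
        debug) opt="-O0 -g" ;;  # assumption: no optimization, debug symbols
        *)     opt="-O2" ;;     # production: higher optimization level
    esac
    # Libraries are listed after the source file so the linker can resolve them.
    echo "$FC $opt -I$INC_DIR source_file.f90 -L$LIB_DIR -l$LIB_NAME -o executable"
}

build_cmd production
build_cmd debug
```

The script only prints the commands, so they can be checked before being pasted at the prompt.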

INTEL Compilers


Initialize the environment with the module command:

> module load intel

The names of the Intel compilers are:

  • ifort: Fortran compiler (Fortran77 and Fortran90)
  • icc: C compiler
  • icpc: C++ compiler

The documentation can be obtained with the man command after loading the relevant module:

> man ifort
> man icc
> man icpc

Some miscellaneous flags are described below:

-extend_source    Extend fixed-form source lines beyond the standard 72-column F77 limit
-free / -fixed    Free/Fixed form for Fortran
-ip               Enables interprocedural optimization for single-file compilation
-ipo              Enables interprocedural optimization between files - whole program optimisation

Recommended options (available for all languages):

For users running a program for the first time

To optimize a program at a general level:

<name_of_compiler>  -O2 source_file

for example for fortran:

> ifort -O2 source_file

For optimizing a program that has already been confirmed to run successfully

If you change from -O2 to -O3, your program may speed up:

<name_of_compiler>  -O3 source_file

for example for fortran:

> ifort -O3 source_file

In this case, you must check that the results of your program with the -O3 option are still accurate.
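
One possible way to perform this check is to save the output of the -O2 and -O3 builds and compare them numerically. The helper below is a hedged sketch: within_tolerance is a hypothetical name, the output files are assumed to contain one number per line, and the tolerance is something you must choose for your own application.

```shell
#!/bin/sh
# Sketch: compare two result files (one number per line) within a relative tolerance.
# On the cluster the files could be produced like this (hypothetical file names):
#   ifort -O2 -o prog_O2 source_file && ./prog_O2 > out_O2.txt
#   ifort -O3 -o prog_O3 source_file && ./prog_O3 > out_O3.txt

within_tolerance() {
    # $1, $2: files to compare; $3: relative tolerance, e.g. 1e-6
    paste "$1" "$2" | awk -v tol="$3" '
        NF != 2 { bad = 1; next }            # files differ in number of lines
        {
            d = $1 - $2; if (d < 0) d = -d   # absolute difference
            m = ($1 < 0) ? -$1 : $1          # magnitude of the reference value
            if (m < 1) m = 1                 # avoid a vanishing scale near zero
            if (d > tol * m) bad = 1
        }
        END { exit (bad ? 1 : 0) }'
}
```

For example, `within_tolerance out_O2.txt out_O3.txt 1e-6 && echo "results agree"` prints the message only when every pair of values agrees within the tolerance. A bitwise comparison would usually be too strict, since higher optimization levels may legitimately change the last digits of floating-point results.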


Compiling for KNL and SKL

Since KNL and SKL nodes are binary compatible with the legacy x86 instruction set, any code compiled for the BDW Marconi nodes will run on these nodes. However, a specific compiler option is needed to generate AVX-512 instructions and obtain better performance from these nodes.

Version 15.0 and newer of the Intel compilers can generate these instructions. For KNL nodes, specify the "-xMIC-AVX512" flag (which generates specific AVX512 instructions, hence the binary will not work on the Broadwell partition) or the "-axMIC-AVX512" flag (which generates executables optimized for both the AVX2 and AVX512 ISAs). For example:

  • -xMIC-AVX512  to generate optimised code for KNL

  • -xCORE-AVX512 to generate optimised code for SKL

  • -axMIC-AVX512  cross platform 2 versions: baseline and KNL

  • -axCORE-AVX512 cross platform 2 versions: baseline and SKL

  • -axMIC-AVX512,CORE-AVX512 three versions: baseline, KNL and SKL

    "baseline": governed by the implied -x flag, default sse2.

module load intel
icc -axMIC-AVX512,CORE-AVX512 -O3 -o executable source.c
icpc -axMIC-AVX512,CORE-AVX512 -O3 -o executable source.cc
ifort -axMIC-AVX512,CORE-AVX512 -O3 -o executable source.f

For SKL nodes, instead, you have to specify the "-xCORE-AVX512" flag in order to generate AVX-512 instructions. When using this option, Intel compilers default to using AVX512 "low", i.e., a 256-bit version of AVX512 through AVX512-VL (see also the compiler documentation for -qopt-zmm-usage=low).
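
The choice among these flags can be captured in a small helper. This is only an illustrative sketch: arch_flag and the target names knl, skl and all are hypothetical, while the flags themselves are the ones listed above.

```shell
#!/bin/sh
# Sketch: map a target partition (hypothetical names) to the Intel flag listed above.
arch_flag() {
    case "$1" in
        knl) echo "-xMIC-AVX512" ;;               # KNL only; will not run on Broadwell
        skl) echo "-xCORE-AVX512" ;;              # SKL only
        all) echo "-axMIC-AVX512,CORE-AVX512" ;;  # baseline, KNL and SKL code paths
        *)   echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print the resulting compile line for an SKL build (source.f as in the example above).
echo "ifort $(arch_flag skl) -O3 -o executable source.f"
```

Keeping the flag selection in one place makes it harder to submit a KNL-only binary to a partition that cannot execute it.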

...

There are certain considerations to be taken into account before running legacy codes on KNL and SKL nodes. Primarily, the effective use of vector instructions is critical to achieving good performance on the cores. For guidelines on how to get vectorization information and improve code vectorization, refer to

 How to Improve Code Vectorization

...