-g: Build the application with debug information to allow binary-to-source correlation in the reports.
-qopenmp: Enable generation of multi-threaded code if OpenMP directives/pragmas exist.
-O2 (or higher): Request compiler optimization.
-vec: Enable vectorization if option -O2 or higher is in effect (enabled by default).
-simd: Enable SIMD directives/pragmas (enabled by default).

For details of these options, refer to the man pages or the documentation of the Intel compilers.
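
As a rough illustration of how these options fit together, the sketch below shows a small loop that a compiler can typically auto-vectorize at -O2, with an assumed build line in the header comment. The file name saxpy.c, the function name saxpy, and the icc driver name are assumptions made for this example and do not come from this page.

```c
/*
 * Assumed build line (driver and file names are illustrative):
 *   icc -std=c99 -g -O2 -qopenmp -c saxpy.c
 * -vec and -simd are enabled by default at -O2, so they are not repeated.
 */
#include <stddef.h>

/* y[i] = a * x[i] + y[i]: unit stride, no branches, no function calls. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The restrict qualifiers tell the compiler that x and y do not overlap, which removes one common obstacle to vectorizing a loop like this.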

Explicit Vectorization

Compiler SIMD directives/pragmas

Users can add compiler SIMD directives/pragmas to the source code to tell the compiler that a suspected data dependency does not actually exist. When the modified source code is recompiled, the compiler can then vectorize the loop. Such SIMD directives/pragmas include:

 

#pragma vector always: instruct the compiler to vectorize a loop whenever it is safe to do so, even if its heuristics judge vectorization not to be profitable
#pragma vector aligned: assert that data within the loop is aligned on a 16-byte boundary
#pragma ivdep: instruct the compiler to ignore assumed (unproven) data dependencies
#pragma simd: enforce vectorization of a loop
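
A minimal sketch of how such a pragma is placed, using #pragma ivdep as the example. The function name shift_scale and its parameters are hypothetical. The pragma is an assertion by the programmer, so it is only correct here because the caller is assumed to pass k >= 0.

```c
/*
 * The compiler cannot prove that k >= 0, so it has to assume that
 * a[i + k] might be an element written by an earlier iteration.
 * Assumption: callers always pass k >= 0, so every read happens before
 * the corresponding write and the assumed dependence can be ignored.
 * Compilers that do not recognize the pragma simply ignore it.
 */
void shift_scale(float *a, int n, int k, float c)
{
#pragma ivdep
    for (int i = 0; i < n; i++) {
        a[i] = a[i + k] * c;
    }
}
```

If k could be negative, the loop would carry a genuine read-after-write dependence and the pragma would have to be removed.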

 

OpenMP directives/pragmas

Users can use the directives/pragmas introduced in OpenMP 4.0 for explicit vectorization:

 

#pragma omp simd: enforce vectorization of a loop
#pragma omp declare simd: instruct the compiler to vectorize a function
#pragma omp parallel for simd: target same loop for threading and SIMD, with each thread executing SIMD instructions
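
The following sketch shows two of these pragmas in context. The function names dot and scale are hypothetical. Without an OpenMP-enabling option such as -qopenmp, the pragmas are ignored and the code still runs serially.

```c
/* Dot product: the reduction clause tells the compiler that the
 * accumulation into sum may be split across SIMD lanes and combined
 * at the end of the loop. */
float dot(const float *x, const float *y, int n)
{
    float sum = 0.0f;
#pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Threads divide the iteration space; each thread executes its share
 * of the iterations with SIMD instructions. */
void scale(float *restrict out, const float *restrict in, float c, int n)
{
#pragma omp parallel for simd
    for (int i = 0; i < n; i++) {
        out[i] = c * in[i];
    }
}
```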

 

Compiler options and macros

Users can also use compiler options and macros for explicit vectorization:


 

-D NOALIAS/-noalias: assert that there is no aliasing of memory references (array addresses or pointers)
-D REDUCTION: apply an omp simd directive with a reduction clause
-D NOFUNCCALL: remove the function and inline the loop
-D ALIGNED/-align: assert that data is aligned on 16B boundary
-fargument-noalias: function arguments cannot alias each other
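
The -D macros only have an effect if the source code tests for them. The sketch below assumes the source guards its annotations with #ifdef, so that each macro switches one optimization hint on. The file name matvec.c, the function matvec, and the helper macro RESTRICT are hypothetical.

```c
/*
 * Assumed build lines (driver and file names are illustrative):
 *   icc -std=c99 -O2 -c matvec.c
 *   icc -std=c99 -O2 -qopenmp -D NOALIAS -D REDUCTION -c matvec.c
 */
#ifdef NOALIAS
#define RESTRICT restrict   /* promise: a, x and y never overlap */
#else
#define RESTRICT            /* no promise: compiler must assume aliasing */
#endif

/* y = A * x for a row-major rows x cols matrix stored in a. */
void matvec(int rows, int cols, const float *RESTRICT a,
            const float *RESTRICT x, float *RESTRICT y)
{
    for (int i = 0; i < rows; i++) {
        float sum = 0.0f;
#ifdef REDUCTION
#pragma omp simd reduction(+:sum)
#endif
        for (int j = 0; j < cols; j++) {
            sum += a[i * cols + j] * x[j];
        }
        y[i] = sum;
    }
}
```

Building the same file with and without each macro makes it easy to compare the compiler's vectorization decisions one hint at a time.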

 

SIMD enabled functions

Users can also declare and use SIMD-enabled functions. In the example below, function foo is declared as a SIMD-enabled function (vector function), so the compiler generates a vectorized version of it. The for loop that calls foo can then be vectorized as well.

 

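/* Declare foo as a SIMD-enabled (vector) function (Intel compiler syntax). */
__attribute__((vector))
float foo(float);

/* restrict (C99) asserts that a and b do not overlap. */
void vfoo(float *restrict a, float *restrict b, int n){
    int i;
    /* The call to foo is vectorized together with the loop. */
    for (i=0; i<n; i++) { a[i] = foo(b[i]); }
}

float foo(float x) { ... }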
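
For comparison, the same idea can be sketched with the OpenMP 4.0 pragmas listed earlier instead of the compiler-specific attribute. The function square_plus_one is a hypothetical stand-in for foo, whose body is not given on this page.

```c
/* Ask the compiler to generate a vector version of this function
 * in addition to the scalar one. */
#pragma omp declare simd
float square_plus_one(float x)
{
    return x * x + 1.0f;
}

/* The loop calls the SIMD-enabled function once per element, so the
 * compiler can use the vector version when it vectorizes the loop. */
void vsquare_plus_one(float *restrict a, const float *restrict b, int n)
{
#pragma omp simd
    for (int i = 0; i < n; i++) {
        a[i] = square_plus_one(b[i]);
    }
}
```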

 

Programming Guidelines for Writing Vectorizable Code

  • Use simple loops; avoid loop-variant upper iteration limits and data-dependent loop exit conditions
  • Write straight-line code: avoid branches such as if constructs, and most function calls
  • Use array notation instead of pointers
  • Use unit stride (increment of 1 for each iteration) in inner loops
  • Use an aligned data layout (aligned memory addresses)
  • Use structures of arrays instead of arrays of structures
  • Use only assignment statements in the innermost loops
  • Avoid data dependencies between loop iterations, such as read-after-write, write-after-read, and write-after-write
  • Avoid indirect addressing
  • Avoid mixing vectorizable types in the same loop
  • Avoid function calls in the innermost loop, except math library calls
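
Several of the guidelines above can be illustrated with a short sketch. The type and function names and the size N are hypothetical and chosen only for this example.

```c
#define N 8  /* illustrative capacity of the structure of arrays */

/* Array of structures: consecutive x values are three floats apart. */
struct point_aos { float x, y, z; };

/* Structure of arrays: all x values are contiguous in memory. */
struct points_soa { float x[N]; float y[N]; float z[N]; };

/* Strided access to x: harder for the compiler to vectorize efficiently. */
float sum_x_aos(const struct point_aos *p, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        sum += p[i].x;
    }
    return sum;
}

/* Unit-stride access over one contiguous array. */
float sum_x_soa(const struct points_soa *p, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        sum += p->x[i];
    }
    return sum;
}

/* Loop-carried read-after-write dependence: iteration i needs the value
 * of a[i - 1] produced by iteration i - 1, so this loop cannot be
 * vectorized as written. */
void prefix_sum(float *a, int n)
{
    for (int i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}
```

Both sum functions return the same result; they differ only in memory layout, which is what determines whether the loads can be done with unit stride.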