Compiling

An advantage of the Intel Xeon Phi accelerator is that application source code does not need to change in order to run on it. The Intel compilers build the same code for either the host CPUs or the accelerator, depending on one compiler flag: -mmic for the accelerator, -xhost for the host CPUs.

The Intel Compiler Suite

The Intel C++ Composer XE 2013 suite is in your path once the correct environment is set up. The GNU Compiler Collection (GCC), version 4.4.6, is available for utilities and serial host applications, but it cannot be used to build MPI applications, MIC offload applications, or native MIC applications. We recommend using the Intel compilers whenever possible. The Intel suite has been installed with 64-bit standard libraries and compiles programs as 64-bit applications by default. Since the E5 processors and Phi coprocessors are new architectures that rely on optimizations in the 2013 compiler, any program compiled for another Intel system should be recompiled.

The Intel C and C++ compiler commands are "icc" and "icpc", respectively. Use the "-help" option with either command to display a list and explanation of all the compiler options, which is useful during debugging and optimization. Please check out Intel C/C++ compilers for additional information.

Basic Compiler Commands and Serial Program Compiling

Compiling serial programs

Compiler  Language  File Extension        Example
icc       C         .c                    icc [compiler_options] prog.c
icpc      C++       .C, .cc, .cpp, .cxx   icpc [compiler_options] prog.cpp


Appropriate file name extensions are required for each compiler. By default, the executable name is "a.out", but it may be renamed with the "-o" option. We use "a.out" throughout this guide to designate a generic executable file. The compiler command performs two operations: it makes a compiled object file (having a .o suffix) for each file listed on the command line, and then combines them with system library files in a link step to create an executable. To compile without the link step, use the "-c" option.

The same code can be compiled to run either natively on the host or natively on the MIC. Use the same compiler commands when compiling for the host (E5) or for the MIC (Phi), but include the "-mmic" option to create a MIC executable. We suggest you name MIC executables with a ".mic" suffix.

Host, MIC and Host+MIC offload compilations

Mode              Required Options  Notes
Native (HOST)     none              Use -xhost to generate AVX (Advanced Vector Extensions) instructions.
Native Phi (MIC)  -mmic             Suggestion: name executables with a ".mic" suffix to differentiate them from host executables.
Host + Offload    none              For automatic offloading of MKL library functions, use environment variables; for direct offloading, use pragmas.
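
For the direct offloading case, a region of code is marked with a pragma so that it runs on the coprocessor while the rest of the program runs on the host. The following is only a sketch, assuming the "#pragma offload target(mic)" syntax of the Intel compiler's offload extensions; the function name scale_offload and its parameters are hypothetical and chosen for illustration. A compiler without offload support ignores the pragma, and the loop then runs on the host.

```c
/* Multiply n floats by a factor, offloading the loop to the coprocessor
 * when the compiler supports the Intel offload pragma.
 * Assumption: in(...) copies src to the coprocessor before the block runs,
 * and out(...) copies dst back to the host afterwards. */
void scale_offload(float *src, float *dst, int n, float factor)
{
    int i;

    #pragma offload target(mic) in(src : length(n)) out(dst : length(n))
    {
        for (i = 0; i < n; i++) {
            dst[i] = factor * src[i];
        }
    }
}
```

Since no extra option is required for this mode, such a file would be compiled with the ordinary host command line.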

The following examples illustrate how to rename an executable (-o option), compile for the host (run on the CPUs), and compile for the MIC (run natively on the MIC):

A C program example:

 login1$ icc   -xhost -O2 -o flamec     prog.c
 login1$ icc   -mmic  -O2 -o flamec.mic prog.c
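
Because one source file is built for both targets, code occasionally needs to know which target it was compiled for. A minimal sketch, assuming the Intel compiler predefines the __MIC__ macro when "-mmic" is given; the function name build_target is hypothetical:

```c
/* Report which target this translation unit was compiled for.
 * Assumption: __MIC__ is predefined only when building for the coprocessor
 * (for example with -mmic), so a host build takes the #else branch. */
const char *build_target(void)
{
#ifdef __MIC__
    return "mic";
#else
    return "host";
#endif
}
```

A program could print this string at startup to confirm that a ".mic" executable really was built for the coprocessor.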


For additional information, execute the compiler command with the "-help" option to display every compiler option, its syntax, and a brief explanation, or display the corresponding man page, as follows:

 login1$ icc   -help
 login1$ icpc  -help
 login1$ ifort -help
 login1$ man icc
 login1$ man icpc
 login1$ man ifort

To find out more about developing for Phi accelerators, please check out Programming and Compiling for Intel® Many Integrated Core Architecture.

Enable Automatic Vectorization

One of the keys to the performance of Intel Xeon Phi coprocessors is their 512-bit registers and the associated SIMD operations. The compiler's vectorizer detects operations in the program that can be done in parallel and converts the sequential operations into parallel ones; for example, it replaces a sequence of scalar operations with a single SIMD instruction that processes 2, 4, 8, or up to 16 elements at once, depending on the data type. The -vec option enables vectorization at default optimization levels for both Intel Xeon host CPUs and Intel Xeon Phi accelerators. Code is more likely to be auto-vectorized when it uses simple for loops, straight-line code (no branching), no loop-carried dependencies, and correctly aligned data. For more information about these techniques, please read Getting Started Tutorial: Using Auto Vectorization. To check how well the code was auto-vectorized, use the compiler flag -vec-report[0|1|2|3|4|5].
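
The difference between a loop the vectorizer can handle and one it usually cannot is sketched below. The function names saxpy and prefix_sum are hypothetical, and the "restrict" qualifier assumes the code is compiled as C99 (for example with -std=c99). The first loop is a simple for loop with no branches and independent iterations; the second carries a dependency from one iteration to the next.

```c
/* Candidate for auto-vectorization: each iteration touches only element i,
 * and "restrict" promises the compiler that x and y do not overlap. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    int i;

    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Loop-carried dependency: iteration i reads the value written by
 * iteration i - 1, so the iterations cannot simply run side by side. */
void prefix_sum(int n, float *y)
{
    int i;

    for (i = 1; i < n; i++) {
        y[i] = y[i] + y[i - 1];
    }
}
```

Compiling such a file with one of the -vec-report levels should show which of the two loops the compiler actually vectorized.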

Compiling OpenMP programs

On Phi accelerators, applications should use OpenMP to take advantage of multi-core shared-memory parallelism. For applications with OpenMP parallel directives, include the -openmp option on the compiler command to enable parallel thread generation. Use the -openmp_report option to display diagnostic information.

Important OpenMP compiler options.

Compiler Option (OpenMP)     Description
-openmp                      Enables the parallelizer to generate multi-threaded code based on the OpenMP directives. Use whenever OpenMP pragmas are present in code for the E5 processor or the Phi coprocessor.
-openmp_report[0, 1, or 2]   Controls the OpenMP parallelizer diagnostic level.

Below are host and MIC compile examples for enabling OpenMP code directives.

 login1$ icc   -xhost -openmp -O2 -o flamec     prog.c
 login1$ icc   -mmic  -openmp -O2 -o flamec.mic  prog.c
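
As an illustration of the kind of directive the -openmp option acts on, here is a minimal sketch of a parallel loop with a reduction. The function names sum_of_squares and max_threads are hypothetical. Without the OpenMP option the pragma is ignored and the code runs serially with the same result.

```c
#ifdef _OPENMP
#include <omp.h>   /* only available when OpenMP is enabled */
#endif

/* Sum i*i for i = 0 .. n-1. With OpenMP enabled, the iterations are split
 * across threads and the per-thread partial sums are combined by the
 * reduction clause. */
long sum_of_squares(int n)
{
    long total = 0;
    int i;

    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        total += (long)i * i;
    }
    return total;
}

/* Number of threads a parallel region may use; 1 in a serial build. */
int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

The thread count at run time is normally controlled with the standard OMP_NUM_THREADS environment variable.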

For more information about OpenMP programming using Intel compilers, please check out Parallelization Using OpenMP from Intel.

Compiling MPI programs

On the Phi cluster, the proprietary Intel® MPI Library 4.1 is currently the only option for compiling and running MPI programs. The Intel MPI environment is set up at login if you use the suggested environment setup described in the previous sections. Before compiling an MPI program, make sure mpiicc and mpiicpc are in your path.

Here are examples of compiling MPI programs for the host and for the accelerators:

  login1$ mpiicc  -xhost -O2 -o simulate      mpi_prog.c
  login1$ mpiicc  -mmic  -O2 -o simulate.mic  mpi_prog.c
  login1$ mpiicpc  -xhost -O2 -o simulate1     mpi_prog1.cpp
  login1$ mpiicpc  -mmic  -O2 -o simulate1.mic  mpi_prog1.cpp

Please check out Using the Intel® MPI Library on Intel® Xeon Phi™ Coprocessor Systems for more information.

Optimization

Please read Intel documents about Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors.