KNL Compilers and Tools
The Knights Landing (KNL) CPUs are true x86 CPUs, supporting the SSE, AVX and AVX2 instruction sets as well as AVX-512. Consequently, code compiled for more or less any x86 system, say with the GNU Compiler Collection, will run on them. However, generating code that takes full advantage of AVX-512 requires a recent compiler, such as GCC or the Intel compiler toolchain.
Since the individual cores on the KNL nodes are weak compared to regular Xeon (e.g. Haswell or Sandy Bridge) cores, it is most efficient to use the interactive nodes for compilation. Currently AVX-512 compilation takes place with the Intel toolchain, although the latest versions of GCC can also generate AVX-512 code.
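Because code is compiled on one node type and run on another, it can help to confirm which AVX-512 subsets a given node actually reports. The following is a minimal sketch, assuming Linux nodes that expose CPU flags in /proc/cpuinfo; the function name list_avx512_flags is our own and not part of any toolchain.

```shell
# List the distinct AVX-512 feature flags reported by the kernel.
# Prints nothing on a CPU without AVX-512 (for example a Haswell node).
# $1 (optional): alternative cpuinfo file, defaults to /proc/cpuinfo.
list_avx512_flags() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # -o prints only the matching flag names, one per line;
    # sort -u collapses the per-core repetition.
    grep -o 'avx512[a-z0-9_]*' "$cpuinfo_file" | sort -u
}
```

Calling list_avx512_flags with no argument reads /proc/cpuinfo of the current node. On a KNL node one would expect it to list avx512f, avx512cd, avx512er and avx512pf, while a Haswell interactive node should print nothing.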
Intel Parallel Studio
The Intel Parallel Studio XE product is available and includes the Intel Composer XE C/C++ and Fortran compilers. It also includes various libraries (MKL, Threading Building Blocks) and tools (Intel Advisor, VTune Amplifier XE, Intel MPI, Intel Trace Analyzer and Collector). We discuss these in various sections. Here we focus on the Intel C/C++ compiler.
Setting up the Compilers and MPI
The Intel Parallel Studio XE product is installed in:
/dist/intel/parallel_studio_xe
It comes with set-up scripts that modify user paths to set up the various tools. A single script, psxevars.sh (for Bash) or psxevars.csh (for CSH/TCSH), sets up the whole product. It should be used as follows (Bash version):
source /dist/intel/parallel_studio_xe/parallel_studio_xe/psxevars.sh intel64
For CSH/TCSH one should source the .csh version. Sourcing the script makes the compilers available for use. The commands are:
icc
the Intel C Compiler
icpc
the Intel C++ Compiler
ifort
the Intel Fortran Compiler
mpiicc
the MPI Wrapper for the Intel C Compiler
mpiicpc
the MPI Wrapper for the Intel C++ Compiler
mpiifort
the MPI Wrapper for the Intel Fortran Compiler
Note: The MPI Wrappers utilize Intel-MPI. There may be caveats when using other MPI Implementations.
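After sourcing the set-up script, a quick check that the commands listed above are actually on the PATH can save a confusing build failure later. The sketch below uses only the POSIX command -v builtin; the helper name check_tools is hypothetical.

```shell
# Report where each named command is found on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if location=$(command -v "$tool"); then
            printf '%-9s %s\n' "$tool" "$location"
        else
            printf '%-9s NOT FOUND\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use, after sourcing psxevars.sh:
#   check_tools icc icpc ifort mpiicc mpiicpc mpiifort
```

If any entry is reported as NOT FOUND, the set-up script was most likely not sourced in the current shell.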
Useful Intel C/C++ compiler options
-O3
Select a high level of optimization
-qopenmp
Enable OpenMP support
-xMIC-AVX512
Enable the generation of AVX-512 code
-qopt-report=level -qopt-report-phase=vec
Generate optimization and vectorization reports, where level is an integer from 0 (no report) to 5 (most detailed). See this note for details on using these options.
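To show how the options listed above combine on one command line, here is a small sketch. The helper knl_cflags and the file name kernel.c are hypothetical, and the default report level of 2 is simply a choice made for this example.

```shell
# Print the options from the list above as a single string.
# $1 (optional): optimization report level, 0 to 5 (default 2 in this sketch).
knl_cflags() {
    level="${1:-2}"
    printf '%s\n' "-O3 -qopenmp -xMIC-AVX512 -qopt-report=${level} -qopt-report-phase=vec"
}

# Typical use (kernel.c is a placeholder for your own source file).
# The command substitution is left unquoted on purpose so that the
# shell splits the string into separate options:
#   icc $(knl_cflags 5) -c kernel.c -o kernel.o
```

Keeping the flags in one place makes it easy to raise the report level while tuning a loop and lower it again afterwards.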
C++ and OpenMP Shared Libraries
The Intel C/C++ compiler utilizes the header files and possibly the C++ Standard Library of the installed GCC compilers. Additionally, OpenMP support is provided through a shared library (libiomp5.so). To ensure that compiled C++ codes run, we recommend adding the locations of these libraries to the shared library path LD_LIBRARY_PATH.
To do this, do the following prior to running:
- source psxevars.sh intel64 as described above.
- add the other directories to the shared library path:
- export LD_LIBRARY_PATH=/dist/<gcc-version>/lib64:/dist/gcc-<version>/lib:/usr/lib64:/usr/lib:$LD_LIBRARY_PATH
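Repeatedly sourcing a login script that contains the export above makes LD_LIBRARY_PATH grow with duplicate entries. One possible way around this is a small helper that prepends a directory only when it is not already listed. This is a sketch, and the name prepend_ld_path is our own.

```shell
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# $1: directory to add.
prepend_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*)
            ;;  # already present, nothing to do
        *)
            # Add a separating colon only if the variable is non-empty.
            LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
            ;;
    esac
    export LD_LIBRARY_PATH
}

# Typical use: call it once per directory, last directory first,
# so that the first directory of the export line ends up in front:
#   prepend_ld_path /usr/lib
#   prepend_ld_path /usr/lib64
```

Once the path is set, running ldd on the compiled executable and looking for "not found" entries is a simple way to confirm that the OpenMP and C++ runtime libraries are resolved.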