Running Knights Landing Jobs

Setting up Shared Libraries

Ensure that shared libraries for OpenMP, MPI, C/C++, etc. are set up as described in the Compilers and Tools section.

Affinity, Binding and MPI Pinning

Appropriate thread affinity and MPI process pinning are important for high performance. These factors are often intertwined. We will first consider OpenMP thread affinity (as if we were running a non-MPI job) and then consider MPI.

OpenMP Thread Affinity

There are two sets of environment variables controlling OpenMP: one set is Intel-specific, and the other is part of the OpenMP 4.0 standard. For further reference, please see the Fermilab pages on Intel OpenMP. Commonly useful variables are:

OMP_NUM_THREADS: the number of OpenMP threads

KMP_PLACE_THREADS: bind threads to X cores with Y threads per core

KMP_AFFINITY: a very general option, see below
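As a minimal sketch of how these variables relate, the thread count can be derived from the core count and the per-core thread count rather than typed by hand. The names CORES and THREADS_PER_CORE are illustrative assumptions, not variables the OpenMP runtime reads, and the 1s,Xc,Yt string form is taken from the example further down this page.

```shell
# Illustrative helper variables (assumptions, not read by the OpenMP runtime)
CORES=64             # X: number of cores to use
THREADS_PER_CORE=2   # Y: hardware threads to use on each core

# One socket, X cores, Y threads per core
export KMP_PLACE_THREADS="1s,${CORES}c,${THREADS_PER_CORE}t"

# Match the OpenMP thread count to the hardware threads selected above
export OMP_NUM_THREADS=$((CORES * THREADS_PER_CORE))

echo "KMP_PLACE_THREADS=${KMP_PLACE_THREADS} OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

Keeping OMP_NUM_THREADS equal to X times Y means one OpenMP thread per selected hardware thread; whether that is the best choice depends on the application.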

In particular, KMP_AFFINITY is a general option with several parts, which may be combined in a comma-separated list. Useful options are:

verbose: display thread assignments at startup

compact: compact the thread IDs (thread IDs run fastest among the hardware threads within a core, slowest among cores and sockets)

scatter: scatter the thread IDs (thread IDs run fastest among sockets (in SNC-4 mode) and cores, and slowest within a core)

granularity=thread: treat available hyperthreads as the finest level of granularity when binding
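The difference between the compact and scatter orderings can be illustrated with a small calculation. This is only a sketch under a simplified model (a single socket, no SNC-4 sub-domains); the names CORES, TPC, compact_core and scatter_core are hypothetical helpers for illustration, and the real placement is whatever the runtime reports with the verbose option.

```shell
# Simplified model (assumption): one socket, CORES cores, TPC hardware threads per core
CORES=4
TPC=2
NTHREADS=$((CORES * TPC))

# compact: consecutive thread IDs fill the hardware threads of one core first
compact_core() { echo $(( $1 / $TPC )); }

# scatter: consecutive thread IDs are dealt out across cores first
scatter_core() { echo $(( $1 % $CORES )); }

tid=0
while [ "$tid" -lt "$NTHREADS" ]; do
    printf 'thread %d: compact -> core %d, scatter -> core %d\n' \
        "$tid" "$(compact_core "$tid")" "$(scatter_core "$tid")"
    tid=$((tid + 1))
done
```

In this model thread 1 shares core 0 with thread 0 under compact, but is placed on core 1 under scatter.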

KMP_AFFINITY can be used in conjunction with KMP_PLACE_THREADS, for example to enable a compact ordering but with only 2 threads per core (bash shell):

   # 64 cores and 128 threads, so 2 threads per core
   export KMP_PLACE_THREADS=1s,64c,2t

   # compact ordering within the core, reporting thread bindings
   export KMP_AFFINITY=verbose,compact,granularity=thread

   # set number of threads
   export OMP_NUM_THREADS=128

   # run the job