Using MCDRAM High Bandwidth Memory on the Knights Landing Nodes

Use Cache Mode

The simplest way to exploit MCDRAM is to use it as a cache. This requires the nodes to be booted into Cache mode, or jobs to be submitted to nodes with the tag cache-quad. No code changes are needed for this mode of operation, and it is probably the best place to start for applications which do not fit entirely into the 16 GB of MCDRAM per node.

Use numactl to force allocations to MCDRAM

If the application and its dynamic working set fit entirely into MCDRAM, one can force the application to perform all of its memory allocations from MCDRAM. This can be accomplished using the numactl utility. In the flat-quadrant cluster mode, DDR memory is NUMA domain 0 and MCDRAM is NUMA domain 1. Hence

numactl -m 1 ./executable
will force a process to use MCDRAM.

One caveat with this approach is that if MCDRAM is full, the job will fail. Also, this will force allocations into MCDRAM, but I am not certain as to the location of automatic/stack memory.
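
If a hard failure when MCDRAM fills up is not acceptable, numactl also offers a preferred policy (--preferred, or -p), which tries the given node first and falls back to other nodes when it is full. The sketch below contrasts the two policies. It assumes MCDRAM is NUMA node 1, as in the flat-quadrant mode described above; mcdram_prefix is a hypothetical helper name used only for illustration, and ./executable stands for your own program.

```shell
#!/bin/sh
# Sketch: choose between strict binding and preferred placement on MCDRAM.
# Assumption: MCDRAM is NUMA node 1 (flat-quadrant mode).
# mcdram_prefix is a hypothetical helper, not part of numactl.

MCDRAM_NODE=1

mcdram_prefix() {
    case "$1" in
        # Strict: allocations must come from MCDRAM; the job fails if it is full.
        bind)   echo "numactl --membind=$MCDRAM_NODE" ;;
        # Preferred: try MCDRAM first, spill over to DDR when it is full.
        prefer) echo "numactl --preferred=$MCDRAM_NODE" ;;
        *)      echo "usage: mcdram_prefix bind|prefer" >&2; return 1 ;;
    esac
}

# Print the NUMA layout (nodes, sizes, free memory) if numactl is installed,
# so the node numbering can be checked before relying on it.
if command -v numactl >/dev/null 2>&1; then
    numactl --hardware || true
fi

# Intended use on a flat-mode node:
#   $(mcdram_prefix prefer) ./executable
```

Checking the output of numactl --hardware first is worthwhile, since the node numbering differs in other memory and cluster modes.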

Use memkind and hbwmalloc

Memkind and hbwmalloc are memory allocators based on jemalloc. Calls into these libraries explicitly allocate memory in the high-bandwidth MCDRAM region. The memkind library and its hbw_malloc interface are available through the Intel Compiler toolchain. For more information, see for example the Colfax KNL MCDRAM tutorial.
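
As a rough illustration of the explicit approach, the following shell sketch writes a minimal C test program that uses the hbwmalloc interface and tries to build it against memkind. It assumes the memkind headers and library are on the compiler's default search paths and that linking with -lmemkind is sufficient; on a particular system a module may need to be loaded first. The file name hbw_demo.c and the array size are arbitrary choices for the example.

```shell
#!/bin/sh
# Sketch: build and run a tiny hbw_malloc test program.
# Assumptions: memkind is installed where the compiler can find it,
# and CC (default: cc) names a working C compiler.

workdir=$(mktemp -d)

# Write the C source; the quoted 'EOF' stops the shell from expanding anything.
cat > "$workdir/hbw_demo.c" <<'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <hbwmalloc.h>

int main(void)
{
    size_t n = 1000000;
    size_t i;
    double *a;

    /* hbw_check_available() returns 0 when high-bandwidth memory is present */
    if (hbw_check_available() != 0) {
        fprintf(stderr, "no high-bandwidth memory detected\n");
        return 1;
    }

    /* Allocate the array explicitly from MCDRAM */
    a = hbw_malloc(n * sizeof *a);
    if (a == NULL) {
        fprintf(stderr, "hbw_malloc failed\n");
        return 1;
    }

    for (i = 0; i < n; i++)
        a[i] = (double)i;
    printf("last element: %f\n", a[n - 1]);

    /* Memory from hbw_malloc must be released with hbw_free */
    hbw_free(a);
    return 0;
}
EOF

# Build against memkind; report instead of aborting if it is not available.
if ${CC:-cc} "$workdir/hbw_demo.c" -o "$workdir/hbw_demo" -lmemkind 2>/dev/null; then
    "$workdir/hbw_demo" || echo "hbw_demo ran but could not use high-bandwidth memory"
else
    echo "could not build against memkind on this machine"
fi
```

Unlike the numactl approach, only the arrays allocated through hbw_malloc are placed in MCDRAM, so bandwidth-critical data can be placed there while everything else stays in DDR.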