Using GPUs

GPUs provide enormous floating-point capacity and memory bandwidth, and can yield up to 10x the performance (science) per dollar, but they require specialized libraries or specialized programming techniques.

Interactive development

The two LQCD / HPC interactive nodes each have one GPU installed, which you may use for interactive software development and testing. All production GPU nodes reached through the gpu queue have 4 GPUs. The interactive nodes can be reached through the following aliases:

  • qcdi -- alias for all interactive nodes
  • qcdfi -- alias for the interactive node with a Fermi GPU
  • qcdkmi -- alias for the interactive node with a Kepler GPU and a MIC accelerator

Programming model

The user's application runs on the host and invokes kernels that execute on the GPUs. These kernels are typically written in NVIDIA's CUDA language, a C-like language that makes the GPU appear to behave like thousands of cores running in parallel, each working on one piece of data (data parallel). Key LQCD kernels ("level 3 routines") have been written in CUDA and wrapped in C and C++ into a single data parallel library that is easily used by community LQCD codes. Parallel jobs typically use one MPI process per GPU, although other approaches are also used.
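
As an illustration of the one-process-per-GPU layout, here is a minimal sketch of a wrapper script that pins each MPI rank to a single GPU before starting the application. It assumes that the launcher exports a node-local rank (MV2_COMM_WORLD_LOCAL_RANK under MVAPICH2, OMPI_COMM_WORLD_LOCAL_RANK under OpenMPI) and that the node has 4 GPUs, as on the production nodes. The helper name pick_gpu and the script name bind_gpu.sh are hypothetical.

```shell
#!/bin/bash
# Sketch: bind each MPI rank on a node to one GPU (assumes 4 GPUs per node).
GPUS_PER_NODE=${GPUS_PER_NODE:-4}

# pick_gpu (hypothetical helper): print the GPU index for this rank.
# The node-local rank is read from the MPI launcher's environment and
# falls back to 0 when neither variable is set.
pick_gpu() {
    local rank=${MV2_COMM_WORLD_LOCAL_RANK:-${OMPI_COMM_WORLD_LOCAL_RANK:-0}}
    echo $(( rank % GPUS_PER_NODE ))
}

# When used as a wrapper (e.g. mpirun ... ./bind_gpu.sh ./my_app), restrict
# the process to its GPU and then start the real application.
if [ "$#" -gt 0 ]; then
    export CUDA_VISIBLE_DEVICES=$(pick_gpu)
    exec "$@"
fi
```

With CUDA_VISIBLE_DEVICES set this way, each process sees only its own GPU as device 0, so the application itself needs no device-selection logic.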

Considerable information is available online on writing custom CUDA routines and linking them into C or C++ applications.
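
For orientation, a typical build compiles the CUDA source with nvcc and then links the resulting object file into the host program together with the CUDA runtime library (libcudart). The sketch below only prints the two commands rather than running them. The file names (kernels.cu, main.cpp, my_app) and the helper build_cmds are hypothetical, and the library path assumes the CUDA-5.0 install location given on this page.

```shell
#!/bin/bash
# Sketch: print the two build steps for a C++ program that calls CUDA kernels.
# File names are placeholders; CUDA_HOME assumes the CUDA-5.0 install path.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda-5.0}

# build_cmds (hypothetical helper): emit the commands for a given
# CUDA source file, host source file, and output name.
build_cmds() {
    local cu_src=$1 host_src=$2 out=$3
    # Step 1: compile the CUDA source to an object file.
    echo "nvcc -c ${cu_src} -o ${cu_src%.cu}.o"
    # Step 2: compile the host code and link against the CUDA runtime.
    echo "g++ ${host_src} ${cu_src%.cu}.o -L${CUDA_HOME}/lib64 -lcudart -o ${out}"
}

build_cmds kernels.cu main.cpp my_app
```

Once the printed commands look right for your sources, they can be run directly or placed in a Makefile.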

Developer Tools

  • gcc 4.6.3 is in /dist/gcc-4.6.3
  • MVAPICH2-1.8 is in /usr/mpi/gcc/mvapich2-1.8
  • OpenMPI-1.6.3 is in /usr/mpi/gcc/openmpi-1.6.3
  • CUDA-5.0 is in /usr/local/cuda-5.0

gcc 4.6.3 is recommended because it supports generating AVX instructions for the host.
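
In gcc, AVX code generation is enabled with the -mavx flag. As a minimal sketch, the helper below chooses host compiler flags depending on whether the CPU advertises AVX in /proc/cpuinfo. The name host_cflags and the choice of -O3 are illustrative assumptions, not site recommendations.

```shell
#!/bin/bash
# host_cflags (hypothetical helper): print gcc flags for the host code.
# Reads CPU feature flags from the given file (default /proc/cpuinfo) and
# adds -mavx only when the "avx" feature is listed.
host_cflags() {
    local cpuinfo=${1:-/proc/cpuinfo}
    if grep -qw avx "$cpuinfo" 2>/dev/null; then
        echo "-O3 -mavx"
    else
        echo "-O3"
    fi
}

echo "host flags: $(host_cflags)"
```

The result can be passed to the compiler, for example gcc $(host_cflags) -c host_code.c, where host_code.c is a placeholder file name.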

Setting up the Environment

The following instructions are for the BASH shell. TCSH users will need to adapt them (for example, using setenv instead of export).

To set up gcc-4.6.3 (if needed), add the following to your .bashrc or job script:

  export PATH=/dist/gcc-4.6.3/bin:$PATH
  export LD_LIBRARY_PATH=/dist/gcc-4.6.3/lib64:/dist/gcc-4.6.3/lib:$LD_LIBRARY_PATH
or alternatively, using the module system:

  module load gcc-4.6.3

To set up mvapich2-1.8, add the following to your .bashrc or job script:

  MPIHOME=/usr/mpi/gcc/mvapich2-1.8
  export PATH=${MPIHOME}/bin:$PATH
  export LD_LIBRARY_PATH=${MPIHOME}/lib:${MPIHOME}/lib64:/usr/lib64:/usr/lib:$LD_LIBRARY_PATH
or alternatively, using the module system:

  module load mvapich2-1.8

To set up OpenMPI-1.6.3, add the following to your .bashrc or job script:

  MPIHOME=/usr/mpi/gcc/openmpi-1.6.3
  export PATH=${MPIHOME}/bin:$PATH
  export LD_LIBRARY_PATH=${MPIHOME}/lib:$LD_LIBRARY_PATH
or alternatively, using the module system:

  module load openmpi-1.6.3

To set up CUDA-5.0, add the following to your .bashrc or job script:

  export PATH=/usr/local/cuda-5.0/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/cuda-5.0/lib64:/usr/local/cuda-5.0/lib:$LD_LIBRARY_PATH
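
Because .bashrc may be sourced more than once (for example in nested shells), repeated export lines can add the same directory to a search path several times. One way to avoid that is sketched below, using a hypothetical helper named prepend_once together with the CUDA-5.0 paths listed above. The same pattern would apply to the gcc and MPI directories.

```shell
#!/bin/bash
# prepend_once (hypothetical helper): prepend directory $2 to the
# colon-separated variable named $1 unless it is already listed there.
prepend_once() {
    local var=$1 dir=$2 cur
    eval "cur=\${$var}"
    case ":$cur:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) eval "export $var=\"\$dir\${cur:+:\$cur}\"" ;;
    esac
}

prepend_once PATH /usr/local/cuda-5.0/bin
# Prepend lib first and lib64 second so that lib64 ends up in front.
prepend_once LD_LIBRARY_PATH /usr/local/cuda-5.0/lib
prepend_once LD_LIBRARY_PATH /usr/local/cuda-5.0/lib64
```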