GPU Jobs

GPUs are available on the batch farm for both interactive and batch processing using slurm commands. GPU use from Auger is not supported. The slurm commands are available on the ifarm machines.
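
For example, a quick way to check which GPU nodes and partitions are currently available from an ifarm machine (a sketch using standard slurm options; the gpu and igpu partition names match the examples below) is:

$ sinfo -p gpu,igpu -o "%P %N %G %T"

This lists each partition, its nodes, the generic resources (GRES, i.e. GPU types and counts) configured on them, and their current state.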

 

The following GPU resources are available:

 

·      K20m (up to four per node)

Batch Use

Submissions should request one to four GPUs and should use the gpu partition. The environment variable CUDA_VISIBLE_DEVICES will be set to show which GPUs are allocated to your job. The examples below use the “ml” account; please submit a ServiceNow request to be added to this account. Here is an example script that allocates two K20m GPUs on a single node and prints the execution environment to the slurm output file.

#!/bin/sh
# slurm sbatch script requesting two K20m GPUs; CUDA_VISIBLE_DEVICES will be set by slurm.
#SBATCH --account=ml
#SBATCH --nodes=1
#SBATCH --partition=gpu
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:K20m:2
#SBATCH --job-name=GPU-test

echo =================
env
echo =================
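
Assuming the script above is saved as, for example, gpu-test.sh (the file name is arbitrary), it can be submitted and monitored with the usual slurm commands:

$ sbatch gpu-test.sh
Submitted batch job <jobid>
$ squeue -u $USER

Once the job finishes, the CUDA_VISIBLE_DEVICES line in the slurm output file (slurm-<jobid>.out by default) shows which GPUs were allocated to the job, for example:

$ grep CUDA_VISIBLE_DEVICES slurm-<jobid>.out
CUDA_VISIBLE_DEVICES=0,1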

 

 

Interactive Use

 

For interactive use of GPUs, slurm is used to allocate GPU resources and avoid conflicts. The interactive partition is named igpu, meaning “interactive GPU”. Here is an example of starting an interactive session on a GPU machine with a single K20m:

$ salloc -n 1 -p igpu --gres=gpu:K20m:1 --account=ml
salloc: Granted job allocation 4474492
$ srun --pty bash
bash-4.2$ hostname
qcd12k0202
bash-4.2$ echo $CUDA_VISIBLE_DEVICES
0
bash-4.2$ exit
$ exit
salloc: Relinquishing job allocation 4474492
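
An equivalent one-step alternative (a sketch, assuming the same igpu partition, ml account, and K20m GPU type used above) is to let srun allocate the GPU and start the interactive shell in a single command; exiting the shell releases the allocation:

$ srun -n 1 -p igpu --gres=gpu:K20m:1 --account=ml --pty bash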