GPUs are available on the batch farm for both interactive
and batch processing using Slurm commands. GPU use from Auger is not supported.
The Slurm commands (sbatch, salloc, etc.) are available on the ifarm machines.
The following GPU resources are available:
· TitanRTX (up to four per node)
Batch use
Submissions should request one to four GPUs and should use
the gpu partition. The environment variable CUDA_VISIBLE_DEVICES will be set to
show which GPUs are allocated to your job. Here is an example script that allocates two NVIDIA TitanRTX GPUs on a single node
and shows the execution environment in the Slurm output file.
#!/bin/sh
# Slurm sbatch script requesting two TitanRTX GPUs.
# CUDA_VISIBLE_DEVICES will be set.
#SBATCH --nodes 1
#SBATCH --partition gpu
#SBATCH --cpus-per-task 1
#SBATCH --gres=gpu:TitanRTX:2
#SBATCH --job-name=GPU-test
echo =================
env
echo =================
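A minimal sketch of submitting and monitoring this job, assuming the script above has been saved as gpu_test.sh (a hypothetical file name):

$ sbatch gpu_test.sh
$ squeue -u $USER -p gpu

The job's output file (slurm-<jobid>.out by default) will contain the environment listing, including the CUDA_VISIBLE_DEVICES setting.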
Interactive Use
For interactive use of GPUs, Slurm is used to allocate GPU
resources and avoid conflicts. Here is an example of starting an interactive session on a
GPU machine with a single TitanRTX GPU:
$ salloc -n 1 -p gpu --gres=gpu:TitanRTX:1
salloc: Granted job allocation 4474492
$ srun --pty bash
bash-4.2$ hostname
qcd12k0202
bash-4.2$ echo $CUDA_VISIBLE_DEVICES
0
bash-4.2$ exit
$ exit
salloc: Relinquishing job allocation 4475870
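As a sketch of an alternative, the allocation and interactive shell can be obtained in a single step with srun; the options used here (-n, -p, --gres, --pty) are standard Slurm options, and the output shown is illustrative:

$ srun -n 1 -p gpu --gres=gpu:TitanRTX:1 --pty bash
bash-4.2$ echo $CUDA_VISIBLE_DEVICES
0
bash-4.2$ exit

As in the salloc example, CUDA_VISIBLE_DEVICES reports the GPU index assigned to the session.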