Available software components are configured using Lmod, a Lua-based environment module system that modifies the PATH and LD_LIBRARY_PATH (bash) shell environment variables and sets any other needed variables. Lmod also guards against loading conflicting software packages into your shell environment. More information on using Lmod is available in the Introduction to lmod.
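To see exactly what loading a module would change, you can inspect it first with the show subcommand, which prints the environment operations (such as prepend_path and setenv) the modulefile performs. For example, using one of the modules loaded later in this section:
[@qcdi1401 ~]$ module show gcc-7.2.0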
You can list all of the available software components with the avail option. If you have not yet configured anything with Lmod, avail will mainly display the available compilers and other utilities that do not depend on a particular compiler. Here is an (edited) example of the listing you will see:
[@qcdi1401 ~]$ module avail
----------------------/etc/modulefiles----------------------
anaconda ansys cmake-3.13.4 gcc-4.8.2
anaconda2 ansys18 computecpp gcc_4.9.2
anaconda2-latest ansys2020r1 curl-59 gcc-4.9.2
[output suppressed]
The load command will enable a software package within your shell environment. If there is more than one package version available, use the package name and version you want to load, e.g. mvapich2-2.3a. The following loads gcc-7.2.0 and mvapich2-2.3a:
[@qcdi1401 ~]$ module load gcc-7.2.0 mvapich2-2.3a
Currently loaded modules can be displayed with the list command:
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) gcc-7.2.0 2) mvapich2-2.3a
If a package is no longer needed, it can be unloaded:
[@qcdi1401 ~]$ module unload gcc-7.2.0
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) mvapich2-2.3a
You can use swap to change to another MPI implementation:
[@qcdi1401 ~]$ module swap mvapich2-2.3a mvapich2-1.8
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) mvapich2-1.8
The purge command will unload all currently loaded modules:
[@qcdi1401 ~]$ module purge
[@qcdi1401 ~]$ module list
No Modulefiles Currently Loaded
This is useful at the beginning of batch scripts to prevent the batch shell from unintentionally inheriting a module environment from the submission shell.
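As a minimal sketch of that pattern (the #SBATCH resource options are placeholders, not site-specific recommendations), the top of a batch script might look like this:
#!/bin/bash
#SBATCH --nodes=1          # placeholder resource request
#SBATCH --time=01:00:00    # placeholder time limit
# Start from a clean module environment instead of inheriting
# whatever was loaded in the submission shell.
module purge
module load gcc-7.2.0 mvapich2-2.3a
# ... job commands follow ...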
To compile code that uses CUDA, first check the available CUDA versions on the login nodes (qcdi1401 or qcdi1402) as follows:
[@qcdi1402 ~]$ module use /dist/modulefiles/
[@qcdi1402 ~]$ module avail
----------------------------------------------------------- /dist/modulefiles/ -----------------------------------------------------------
anaconda2/4.4.0 anaconda3/5.2.0 cmake/3.21.1 curl/7.59 gcc/7.1.0 gcc/8.4.0 go/1.15.4
anaconda2/5.2.0 cmake/3.17.5 cuda/10.0 gcc/10.2.0 gcc/7.2.0 gcc/9.3.0 singularity/2.3.1
anaconda3/4.4.0 cmake/3.18.4 cuda/9.0 gcc/5.3.0 gcc/7.5.0 go/1.13.5 singularity/3.6.4
------------------------------------------------------------ /etc/modulefiles ------------------------------------------------------------
anaconda ansys18 gcc_4.6.3 gcc-4.9.2 gcc-6.2.0 gsl-1.15 mvapich2-1.8
anaconda2 ansys2020r1 gcc-4.6.3 gcc_5.2.0 gcc-6.3.0 hdf5-1.8.12 mvapich2-2.1
......
Load the desired CUDA version as follows:
[@qcdi1402 ~]$ module load cuda/10.0
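Loading the module places the CUDA toolchain on your PATH. As a hedged sketch of a compile step (my_kernel.cu is a hypothetical source file, not something provided by the system):
[@qcdi1402 ~]$ nvcc --version                 # confirm the CUDA 10.0 compiler is active
[@qcdi1402 ~]$ nvcc -o my_kernel my_kernel.cu # my_kernel.cu is a hypothetical example file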
Near the beginning of your batch script, before launching an MPI process, ensure that only the software modules required by the batch script are loaded. For example, if using gcc-7.2.0 and mvapich2-2.3a:
[@qcdi1401 ~]$ module purge
[@qcdi1401 ~]$ module load gcc-7.2.0 mvapich2-2.3a
There are two common mechanisms for starting MPI processes: mpirun and srun. An mpirun command is provided by each MPI implementation and is specific to that implementation; the commands provided by openmpi, mvapich, and Intel impi will attempt to use Slurm interfaces to distribute and start MPI binaries. In addition, Slurm provides the srun command, which can also start MPI processes. The following command (run on a 16p worker node) lists the MPI APIs that srun supports:
$ srun --mpi=list
srun: MPI types are...
srun: pmix_v1
srun: pmix
srun: none
srun: openmpi
srun: pmi2
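As an illustrative example of the srun mechanism (./my_mpi_app and the task count are placeholders, and the appropriate --mpi value depends on how your MPI library was built), a job step could be launched as:
[@qcdi1401 ~]$ srun --mpi=pmi2 -n 16 ./my_mpi_app   # placeholder binary and task count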
If you need support or have questions, please use the following support web page.