Available software components are configured using Lmod, a Lua-based environment module system that modifies the PATH and LD_LIBRARY_PATH (bash) shell environment variables and sets any other needed variables. Lmod also guards against loading conflicting software packages into your shell environment. More information on using Lmod is available in the Introduction to lmod.
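To see exactly what loading a module would change, you can inspect it first with the show subcommand, which prints the environment operations (such as prepend_path and setenv) the modulefile performs. For example, using one of the modules loaded later in this section:
[@qcdi1401 ~]$ module show gcc-7.2.0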
You can list all of the available software components with the avail option. If you have not yet configured anything with Lmod, avail will mainly display the available compilers and other utilities that do not depend on a particular compiler. Here is an (edited) example of the listing you will see:
[@qcdi1401 ~]$ module avail
----------------------/etc/modulefiles----------------------
anaconda ansys cmake-3.13.4 gcc-4.8.2
anaconda2 ansys18 computecpp gcc_4.9.2
anaconda2-latest ansys2020r1 curl-59 gcc-4.9.2
[output suppressed]
The load command will enable a software package within your shell environment. If there is more than one package version available, use the package name and version you want to load, e.g. mvapich2-2.3a. The following loads gcc-7.2.0 and mvapich2-2.3a:
[@qcdi1401 ~]$ module load gcc-7.2.0 mvapich2-2.3a
Currently loaded modules can be displayed with the list command:
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) gcc-7.2.0 2) mvapich2-2.3a
If a package is no longer needed, it can be unloaded:
[@qcdi1401 ~]$ module unload gcc-7.2.0
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) mvapich2-2.3a
You can use swap to change to another MPI implementation:
[@qcdi1401 ~]$ module swap mvapich2-2.3a mvapich2-1.8
[@qcdi1401 ~]$ module list
Currently Loaded Modulefiles:
1) mvapich2-1.8
The purge command will unload all currently loaded modules:
[@qcdi1401 ~]$ module purge
[@qcdi1401 ~]$ module list
No Modulefiles Currently Loaded
This is useful at the beginning of batch scripts to prevent the batch shell from unintentionally inheriting a module environment from the submission shell.
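As a minimal sketch of that pattern (the #SBATCH resource options are placeholders, not site-specific recommendations), the top of a batch script might look like this:
#!/bin/bash
#SBATCH --nodes=1          # placeholder resource request
#SBATCH --time=01:00:00    # placeholder time limit
# Start from a clean module environment instead of inheriting
# whatever was loaded in the submission shell.
module purge
module load gcc-7.2.0 mvapich2-2.3a
# ... job commands follow ...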
To compile code that uses CUDA, first check the available CUDA versions on the login nodes (qcdi1401 or qcdi1402) as follows:
[@qcdi1402 ~]$ module use /dist/modulefiles/
[@qcdi1402 ~]$ module avail
----------------------------------------------------------- /dist/modulefiles/ -----------------------------------------------------------
anaconda2/4.4.0 anaconda3/5.2.0 cmake/3.21.1 curl/7.59 gcc/7.1.0 gcc/8.4.0 go/1.15.4
anaconda2/5.2.0 cmake/3.17.5 cuda/10.0 gcc/10.2.0 gcc/7.2.0 gcc/9.3.0 singularity/2.3.1
anaconda3/4.4.0 cmake/3.18.4 cuda/9.0 gcc/5.3.0 gcc/7.5.0 go/1.13.5 singularity/3.6.4
------------------------------------------------------------ /etc/modulefiles ------------------------------------------------------------
anaconda ansys18 gcc_4.6.3 gcc-4.9.2 gcc-6.2.0 gsl-1.15 mvapich2-1.8
anaconda2 ansys2020r1 gcc-4.6.3 gcc_5.2.0 gcc-6.3.0 hdf5-1.8.12 mvapich2-2.1
......
Load the desired CUDA version as follows:
[@qcdi1402 ~]$ module load cuda/10.0
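Loading the module places the CUDA toolchain on your PATH. As a hedged sketch of a compile step (my_kernel.cu is a hypothetical source file, not something provided by the system):
[@qcdi1402 ~]$ nvcc --version                 # confirm the CUDA 10.0 compiler is active
[@qcdi1402 ~]$ nvcc -o my_kernel my_kernel.cu # my_kernel.cu is a hypothetical example file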
Near the beginning of your batch script, before launching an MPI process, ensure that only the software modules required by the batch script are loaded. For example, if using gcc-7.2.0 and mvapich2-2.3a:
[@qcdi1401 ~]$ module purge
[@qcdi1401 ~]$ module load gcc-7.2.0 mvapich2-2.3a
There are two common mechanisms for starting MPI processes: mpirun and srun. An mpirun command is provided by each MPI implementation and is specific to that implementation; the commands provided by openmpi, mvapich, and Intel impi will attempt to use Slurm interfaces to distribute and start MPI binaries. In addition, Slurm provides the srun command, which can also start MPI processes. The following command (run on a 16p worker node) lists the MPI APIs that srun supports:
$ srun --mpi=list
srun: MPI types are...
srun: pmix_v1
srun: pmix
srun: none
srun: openmpi
srun: pmi2
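As an illustrative example of the srun mechanism (./my_mpi_app and the task count are placeholders, and the appropriate --mpi value depends on how your MPI library was built), a job step could be launched as:
[@qcdi1401 ~]$ srun --mpi=pmi2 -n 16 ./my_mpi_app   # placeholder binary and task count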
If you need support or have questions, please use the following support web page.