HKUST SuperPOD Software - Module System

Lmod manages the application software installed on our system. With the module system, users configure their shell environment to gain access to applications, which simplifies running and compiling software. Lmod also allows multiple versions of the same software to coexist, hiding the complexity of versioning and of dependencies on the operating system.

When users log in to the cluster, they get a minimal default environment with limited software available. Additional software packages are activated on demand through the module system: to use software installed on the HKUST SuperPOD, first load the corresponding module. Loading a module adjusts the user's environment variables so that the GPU-optimized software package associated with that module becomes accessible. For instance, the $PATH environment variable is modified so that the executables required by the package can be found.
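To see this effect directly, one can compare $PATH before and after loading a module. The sketch below is a minimal illustration, not part of the SuperPOD tooling: new_path_entries is a hypothetical helper name, and the call to module load is guarded so that the snippet also runs on a machine where the module command is not defined.

```shell
# new_path_entries BEFORE AFTER
# Print each colon-separated entry of AFTER that does not appear in BEFORE.
# (Hypothetical helper for illustration. The body runs in a subshell so the
# IFS and noglob settings do not leak into the calling shell.)
new_path_entries() (
    before=":$1:"
    IFS=':'
    set -f                      # do not glob-expand PATH entries
    for entry in $2; do
        case "$before" in
            *":$entry:"*) ;;    # already present before the load
            *) printf '%s\n' "$entry" ;;
        esac
    done
)

path_before="$PATH"
if type module >/dev/null 2>&1; then
    module load Anaconda3       # module name taken from the module list in this document
fi
# Show the directories that loading the module added to PATH (if any).
new_path_entries "$path_before" "$PATH"
```

If nothing is printed, either the module command was not available or the module did not add any new directory to $PATH.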

Module Usage 

Commonly used Lmod commands:

Module command                                                Description
module list                                                   List the modules loaded in the current environment
module avail                                                  List available software
module apropos Anaconda                                       Search for particular software, e.g., Anaconda
module whatis apptainer                                       Display information about a particular module, e.g., apptainer
module load Anaconda3                                         Load a particular module
module unload Anaconda3                                       Unload a particular module
module swap cuda11.8/toolkit/11.8.0 cuda12.2/toolkit/12.2.2   Swap modules, e.g., replace the cuda11.8 toolkit with cuda12.2
module purge                                                  Unload all modules
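These commands are often combined at the start of a session or script to reach a known environment. The following is a minimal sketch under that assumption: reset_and_load is a hypothetical helper name, the module names are taken from the module list in this document, and the error handling relies on module load returning a non-zero status on failure. Since module purge normally unloads everything, including modules that were loaded at login, anything still needed has to be loaded again.

```shell
# reset_and_load MODULE...
# Unload everything, then load the given modules in order.
# Stops at the first module that fails to load.
# (Hypothetical helper; assumes "module load" returns non-zero on failure.)
reset_and_load() {
    module purge || return 1
    for mod_name in "$@"; do
        if ! module load "$mod_name"; then
            echo "could not load module: $mod_name" >&2
            return 1
        fi
    done
}

# Only attempt this where the module command exists.
if type module >/dev/null 2>&1; then
    reset_and_load slurm/slurm/23.02.6 cuda12.2/toolkit/12.2.2
    module list
fi
```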

 

 

Module List

From cluster manager: 

apptainer                 gcc/13.1.0                mariadb-libs              boost/1.81.0
gcc/64/4.1.7a1            slurm/slurm/23.02.6       cluster-tools/10.0        ipmitool/1.8.19
freeipmi/1.6.10           python39                  Anaconda3/2023.09-0       cm-pmix3/3.1.7
cm-pmix4/4.1.3            cuda11.8/blas/11.8.0      cuda11.8/fft/11.8.0       cuda11.8/toolkit/11.8.0
cuda12.2/blas/12.2.2      cuda12.2/fft/12.2.2       cuda12.2/toolkit/12.2.2   gdb/13.1
hdf5_18/1.8.21            hpl/2.3                   hwloc/1.11.13             hwloc2/2.8.0
mvapich2/gcc/64/2.3.7     nvhpc-byo-compiler/23.11  nvhpc-hpcx-cuda11/23.11   nvhpc-hpcx-cuda12/23.11
nvhpc-hpcx/23.11          nvhpc-nompi/23.11         nvhpc-openmpi3/23.11      nvhpc/23.11
openblas/dynamic/0.3.18   openmpi/gcc/64/4.1.5      openmpi4/gcc/4.1.5        ucx/1.10.1

 

Examples

Loading the Anaconda Python module for Python code development

To check all available modules:

$ module avail
------------------------------------------------ /cm/local/modulefiles -------------------------------------------------
apptainer/apptainer.module  cmd              gcc/13.1.0       mariadb-libs  null      shared
boost/1.81.0                cmjob            gcc/64/4.1.7a1   module-git    openldap  slurm/slurm/23.02.6
cluster-tools/10.0          dot              ipmitool/1.8.19  module-info   python3   use.own
cm-bios-tools               freeipmi/1.6.10  luajit           modules       python39

------------------------------------------------ /cm/shared/modulefiles ------------------------------------------------
Anaconda3/2023.09-0      cuda12.2/fft/12.2.2      hwloc2/2.8.0              nvhpc-openmpi3/23.11
cm-pmix3/3.1.7           cuda12.2/toolkit/12.2.2  mvapich2/gcc/64/2.3.7     nvhpc/23.11
cm-pmix4/4.1.3           default-environment      nvhpc-byo-compiler/23.11  openblas/dynamic/0.3.18
cuda11.8/blas/11.8.0     gdb/13.1                 nvhpc-hpcx-cuda11/23.11   openmpi/gcc/64/4.1.5
cuda11.8/fft/11.8.0      hdf5_18/1.8.21           nvhpc-hpcx-cuda12/23.11   openmpi4/gcc/4.1.5
cuda11.8/toolkit/11.8.0  hpl/2.3                  nvhpc-hpcx/23.11          ucx/1.10.1
cuda12.2/blas/12.2.2     hwloc/1.11.13            nvhpc-nompi/23.11

 

You can now run “module load” and then run your code. When no version number is given, the system default version is loaded.

$ module load Anaconda3
$ module list
Currently Loaded Modulefiles:
 1) slurm/slurm/23.02.6   2) Anaconda3/2023.09-0
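A script can confirm that a module really is loaded before it continues, and can give the full module name when it depends on one particular version. The sketch below assumes the module system exports the colon-separated LOADEDMODULES variable, which Lmod and Environment Modules normally do; is_loaded is a hypothetical helper name.

```shell
# is_loaded NAME
# Succeed if NAME (with or without a version) appears in LOADEDMODULES.
# (Hypothetical helper; assumes LOADEDMODULES is a colon-separated list.)
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"* | *":$1/"*) return 0 ;;
        *) return 1 ;;
    esac
}

if type module >/dev/null 2>&1; then
    module load Anaconda3/2023.09-0    # full name selects this exact version
fi

if is_loaded Anaconda3; then
    echo "Anaconda3 is loaded"
else
    echo "Anaconda3 is not loaded" >&2
fi
```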

 

To have a specific module loaded automatically at every login, run:

$ module initadd Anaconda3

Run the following command to check that the list was updated:

$ module initlist
bash initialization file $HOME/.bashrc loads modules:
        slurm Anaconda3/2023.09-0

To stop a module from being loaded at login:

$ module initrm Anaconda3