
Running Jobs on Deceema

Running Jobs

Jobs on Deceema are under the control of the Slurm scheduling system. The scheduling system is configured to offer an equitable distribution of resources over time to all users. The key means by which this is achieved are:

  • Jobs are scheduled according to the resources that are requested.

  • Jobs are not necessarily run in the order in which they are submitted.

  • Jobs requiring a large number of cores and/or a long walltime will have to queue until the requested resources become available. The system will run smaller jobs that can fit in the available gaps until all of the resources requested for the larger job become available - this is known as backfill. It is therefore beneficial to specify a realistic walltime for a job so it can be fitted into these gaps.

Slurm Jobs

Here we give a quick introduction to Slurm commands. Those requiring finer-grained control should consult the relevant Slurm documentation. Jobs move through a (simplified!) lifecycle as follows:

Submitting a Job

The command to submit a job is sbatch. For example, to submit the set of commands contained in the file myscript.sh, use the command:

    sbatch myscript.sh

The system will return a job number, for example:

    Submitted batch job 55260

Slurm is aware of your current working directory when submitting the job so there is no need to manually specify it in the script.

Upon completion of the job, there will be two output files in the directory from which you submitted the job. These files, for job id 55260, are:

  • slurm-55260.out - standard out and standard error output
  • slurm-55260.stats - information about the job from Slurm
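Putting these pieces together, a minimal myscript.sh might look like the following sketch. The account and QoS values are placeholders (see "Associate Jobs with Projects and QoS" below), and the module commands are guarded so the sketch can also be dry-run off the cluster:

```shell
#!/bin/bash
#SBATCH --account=project-name   # placeholder: your project code
#SBATCH --qos=qos-name           # placeholder: your QoS
#SBATCH --time=10:0              # 10 minutes of walltime

# Reset the environment, as recommended elsewhere on this page.
# (Guarded so the sketch also runs where "module" does not exist.)
if command -v module >/dev/null; then module purge && module load deceema; fi

echo "Job ${SLURM_JOB_ID:-unknown} running on $(hostname)"
```

Submitting this with sbatch myscript.sh returns the job ID, and the echo output lands in the slurm-<job id>.out file described above.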

Cancelling a Job

To cancel a queued or running job use the scancel command and supply it with the job ID that is to be cancelled. For example, to cancel the previous job:

    scancel 55260

Monitoring Your Jobs

There are a number of ways to monitor the current status of your job. You can view what’s going on by issuing any one of the following commands:

  • squeue is Slurm’s command for viewing the status of your jobs. This shows information such as the job’s ID and name, the partition (which tells you the node type), the QoS, the user that submitted the job, the time elapsed and the number of nodes being used.

  • scontrol is a powerful interface that provides an advanced amount of detail regarding the status of your job. The show command within scontrol can be used to view details regarding a specific job.

For example:
    squeue
    squeue -j 55260
    scontrol show job 55260

Associate Jobs with Projects and QoS

Every job has to be associated with a project to ensure the equitable distribution of resources. Project owners and members will have been issued a project code for each registered project, and only usernames authorised by the project owner will be able to run jobs using that project code. Additionally, every job has to be associated with a QoS.

You can see what projects you are a member of, and what QoS are available to you, by running the command:

    my_deceema

If you are registered on more than one project then the project should be specified using the --account option followed by the project code. For example, if your project code is project-name then add the following line to your job script:

    #SBATCH --account=project-name

You can specify the QoS using the --qos option followed by the QoS name. For example, if the QoS is qos-name then add the following line to your job script:

    #SBATCH --qos=qos-name

Array Jobs

Array jobs are an efficient way of submitting many similar jobs that perform the same work using the same script but on different data. Sub-jobs are the jobs created by an array job and are identified by an array job ID and an index. For example, in the identifier 55620_1, 55620 is the array job ID and 1 is the sub-job index.

Example Array Job
    #!/bin/bash
    #SBATCH --account=_projectname_
    #SBATCH --qos=_qosname_
    #SBATCH --time=5:0
    #SBATCH --array=2-5%2

    set -e
    module purge
    module load deceema
    echo "${SLURM_JOB_ID}: Job ${SLURM_ARRAY_TASK_ID} of ${SLURM_ARRAY_TASK_MAX} in the array"

In Slurm, there are different environment variables that can be used to dynamically keep track of these identifiers.

  • #SBATCH --array=2-5%2 tells Slurm that this job is an array job which should run 4 sub-jobs (with IDs 2, 3, 4 and 5). You can specify up to 4,096 array tasks in a single job (e.g. --array=1-4096). The % separator sets the maximum number of sub-jobs allowed to run simultaneously; in this case only 2 will run at a time.

  • SLURM_ARRAY_TASK_COUNT will be set to the number of tasks in the job array, so in the example this will be 4.

  • SLURM_ARRAY_TASK_ID will be set to the job array index value, so in the example there will be 4 sub-jobs, each with a different value (from 2 to 5).

  • SLURM_ARRAY_TASK_MIN will be set to the lowest job array index value, so in the example this will be 2.

  • SLURM_ARRAY_TASK_MAX will be set to the highest job array index value, so in the example this will be 5.

  • SLURM_ARRAY_JOB_ID will be set to the job ID provided by running the sbatch command.

Visit the Job Array Support section of the Slurm documentation for more details on how to carry out an array job.
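A common pattern is to use SLURM_ARRAY_TASK_ID to select a different input file for each sub-job. Below is a sketch; the data file naming is hypothetical, and a default index is supplied so the script can be dry-run outside Slurm:

```shell
#!/bin/bash
# Inside an array job, Slurm sets SLURM_ARRAY_TASK_ID for each sub-job;
# default to 2 (the first index in --array=2-5%2) for a dry run.
TASK_ID="${SLURM_ARRAY_TASK_ID:-2}"

# Hypothetical naming scheme: one input file per array index.
INPUT="data/input_${TASK_ID}.csv"

echo "Sub-job ${TASK_ID} would process ${INPUT}"
```

Each of the four sub-jobs then runs the same script but operates on its own input file.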

Requesting GPUs

There are many methods for requesting GPUs for your job.

CUDA_VISIBLE_DEVICES and Slurm Jobs

In a Slurm job, CUDA_VISIBLE_DEVICES will index from 0 no matter which GPUs you have been allocated on that node. So, if you request 2 GPUs on a node then you will always see CUDA_VISIBLE_DEVICES=0,1 and these will be mapped to the GPUs allocated to your job.

Available GPUs

We provide a helper script, called hypestatus, that reports current GPU availability on Deceema:

    $ hypestatus
    Current Deceema GPU availability:
    * 1 node with 1 x A100-40 available
    * 2 nodes with 4 x A100-80 available

The information listed is accurate only at the moment the command is run; the GPUs shown may be allocated to another job shortly afterwards.

GPU Type

Deceema has both A100-40GB and A100-80GB GPUs available. To request a specific GPU type for a job you should add a constraint to your job submission script:

    #SBATCH --constraint=_feature_

where _feature_ is:

  • a100_40 for the A100-40GB GPU nodes

  • a100_80 for the A100-80GB GPU nodes

If a job does not specify a GPU type, then the system will select the most appropriate. This means that a job may span GPU types.

Mixed GPU Types

If your job requires all GPUs to have the same amount of memory (either all A100-40s or all A100-80s) then you must specify the appropriate feature.

Multi-GPU, Multi-Task, or Multi-Node Jobs

In the examples below we will only show the SBATCH options related to requesting GPUs, tasks, and nodes. In each case we report output showing GPU (PCI Bus address) and process mapping (using gpus_for_tasks.cpp from NERSC); and the value of CUDA_VISIBLE_DEVICES for each task. This was done using the following, with the relevant Slurm headers added in the blank line:

    #!/bin/bash

    module purge
    module load deceema
    module load fosscuda/2020b
    g++ -o gpus gpus_for_tasks.cpp -lmpi -lcuda -lcudart
    srun env|grep CUDA_VISIBLE_DEVICES
    srun ./gpus

GPU Visibility to Tasks

By default, each task on a node will see all the GPUs allocated on that node to your job.

Further GPU information is available by adding srun nvidia-smi -L to the above script.

All these examples use srun to launch the individual processes. The behaviour of mpirun is different and you should confirm it works as you expect.

Single GPU, Single Task, Single Node

    #SBATCH --gpus-per-task 1
    #SBATCH --tasks-per-node 1
    #SBATCH --nodes 1
    Rank 0 out of 1 processes: I see 1 GPU(s).
    0 for rank 0: 0000:31:00.0
    CUDA_VISIBLE_DEVICES=0

Multi GPU, Single Task, Single Node

    #SBATCH --gpus-per-task 3
    #SBATCH --tasks-per-node 1
    #SBATCH --nodes 1
    Rank 0 out of 1 processes: I see 3 GPU(s).
    0 for rank 0: 0000:31:00.0
    1 for rank 0: 0000:4B:00.0
    2 for rank 0: 0000:CA:00.0
    CUDA_VISIBLE_DEVICES=0,1,2

Single GPU, Multi Task, Single Node

    #SBATCH --gpus-per-task 1
    #SBATCH --tasks-per-node 2
    #SBATCH --nodes 1
    Rank 0 out of 2 processes: I see 1 GPU(s).
    0 for rank 0: 0000:31:00.0
    Rank 1 out of 2 processes: I see 1 GPU(s).
    1 for rank 1: 0000:4B:00.0
    CUDA_VISIBLE_DEVICES=0
    CUDA_VISIBLE_DEVICES=0

Multi GPU, Multi Task, Single Node

    #SBATCH --gpus-per-task 2
    #SBATCH --tasks-per-node 2
    #SBATCH --nodes 1
    Rank 0 out of 2 processes: I see 2 GPU(s).
    0 for rank 0: 0000:31:00.0
    1 for rank 0: 0000:4B:00.0
    Rank 1 out of 2 processes: I see 2 GPU(s).
    2 for rank 1: 0000:CA:00.0
    3 for rank 1: 0000:E3:00.0
    CUDA_VISIBLE_DEVICES=0,1
    CUDA_VISIBLE_DEVICES=0,1

The --gpu-bind option can be used to restrict the visibility of GPUs to individual tasks.

Multi GPU, Single Task, Multi Node with GPU/Task Binding

    #SBATCH --gpus-per-task 2
    #SBATCH --gpu-bind=map_gpu:0,1,2,3
    #SBATCH --tasks-per-node 1
    #SBATCH --nodes 2
    Rank 0 out of 2 processes: I see 2 GPU(s).
    0 for rank 0: 0000:31:00.0
    1 for rank 0: 0000:4B:00.0
    Rank 1 out of 2 processes: I see 2 GPU(s).
    0 for rank 1: 0000:31:00.0
    1 for rank 1: 0000:4B:00.0
    CUDA_VISIBLE_DEVICES=0,1
    CUDA_VISIBLE_DEVICES=0,1

Multiple GPUs, Multiple Tasks Per Node, and --gpu-bind

Requesting multiple GPUs per task, multiple tasks per node, and using --gpu-bind is not supported by Slurm. Instead you will need to programmatically map the correct GPUs to tasks.
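One way to do that mapping, assuming one GPU per task, is to index into CUDA_VISIBLE_DEVICES with the task's node-local rank (SLURM_LOCALID). A sketch, with stand-in default values so it can be tested outside a Slurm job:

```shell
#!/bin/bash
# In a real job Slurm sets CUDA_VISIBLE_DEVICES and SLURM_LOCALID;
# the defaults below are stand-ins for local testing only.
CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:-0,1,2,3}"
SLURM_LOCALID="${SLURM_LOCALID:-1}"

# Split the comma-separated GPU list into a bash array,
# then keep only the entry for this task's node-local rank.
IFS=',' read -r -a gpus <<< "${CUDA_VISIBLE_DEVICES}"
export CUDA_VISIBLE_DEVICES="${gpus[${SLURM_LOCALID}]}"

echo "task ${SLURM_LOCALID} -> GPU ${CUDA_VISIBLE_DEVICES}"
```

Run via srun, each task would execute this snippet and end up with exactly one GPU visible.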

Job and Resource Limits

When submitting a job there are some limits imposed on what you can request:

  1. The maximum duration you can request is 10 days.

  2. You are limited to 8 nodes (32 GPUs) for a single job.

If you submit a request that exceeds these constraints the job will be rejected immediately upon submission. Please contact us if you would like to run jobs of this kind.

Interactive Jobs

For some tasks, such as debugging a job or self-installing software, you may require an interactive job on a Deceema compute node. This is done using the srun command with the --pty (pseudo-terminal) option.

    srun --account _projectname_ --qos _qos_ --gpus _count_ --time 5 --export=USER,HOME,PATH,TERM --pty /bin/bash

srun options

  • --pty /bin/bash: this requests a bash shell on the compute node. The --pty option must be given at the end of the command

  • --export=USER,HOME,PATH,TERM: this exports a required subset of environment variables

  • --time 5: request a 5-minute job. (You will almost certainly want to change this!)

  • --qos _qos_: specifies the QoS for the job

  • --gpus _count_: specifies the number of GPUs for the job

  • --account _projectname_: optional; defines the project account under which you want to run the job

  • --x11: optional; allows X11 forwarding for GUI applications

The resource options for an interactive srun command are the same as when requesting resources in a job script; further information can be found in the Slurm documentation: https://slurm.schedmd.com/srun.html

Once you’ve executed the command and the underlying Slurm job has been allocated you will see your prompt’s hostname change to one of the Deceema compute nodes. When you are connected to the compute node you should reset your environment to ensure you have not inherited settings from a login node. To do this run:

    module purge; module load deceema

If you have added anything to your .bashrc file you should also run the command:

    source ~/.bashrc

When you disconnect from the interactive job, by exiting the terminal, then the job will automatically be cancelled. If you use an interactive job then it is your responsibility to make sure that you make good use of the resources.

Idle Jobs

Please don’t leave the job idle, and remember to disconnect as soon as you’ve finished your work.

Available Applications

The Deceema Apps page provides details of the applications available on Deceema.

Applications Available on Login Nodes

Applications are available on the login nodes for simple tasks such as compiling your code. We may kill, without warning, any long-running or CPU intensive process running on a login node.

Environments

There is one module environment available to all users on Deceema, called hype-apps/live. It is accessed with module load hype-apps/live, which is done by default. Applications in this stack will only be removed or altered with warning to those who have recently used them.

Short-lived module environments

We may create short-lived module environments for specific testing or events. Modules provided through these environments may be experimental and may be removed or altered without warning.

To reset to the default environment you should run:

    module purge
    module load deceema

These should be the first commands in your job scripts and the first commands you run when entering an interactive job.

Toolchains

Applications on Deceema are built on top of toolchains, which are a default set of tools and libraries that are used as the basis for all the applications we install. The toolchains on Deceema follow those defined by EasyBuild and this enables us to build an application stack where the majority of the modules are compatible.

Each of the toolchains is built on top of sub-toolchains, which consist of some parts of the full toolchain.

The toolchains available on Deceema are:

  • foss: GNU compilers (gcc, g++, gfortran), OpenBLAS, FFTW, ScaLAPACK, OpenMPI

    • sub-toolchains: gompi, GCC, GCCcore

    • CUDA is loaded as a separate module by software that can use GPUs

  • nvofbf: NVHPC, OpenBLAS, FFTW, ScaLAPACK, OpenMPI

  • fosscuda: GNU compilers (gcc, g++, gfortran), OpenBLAS, FFTW, ScaLAPACK, OpenMPI, and CUDA

    • sub-toolchains: gompic, GCC, GCCcore

  • intel: Intel oneAPI - compilers (icc, icpc, ifort), MKL and MPI

    • sub-toolchains: iimpi, intel-compilers, GCCcore

We also provide the NVIDIA HPC SDK (NVHPC), but we do not have a toolchain built on top of this.

Available Toolchains

Toolchain        Compilers       Maths Libraries                                 MPI             CUDA
nvofbf/2022.07   NVHPC 2022.07   OpenBLAS 0.3.20, FFTW 3.3.10, ScaLAPACK 2.2.0   OpenMPI 4.1.4   11.7.0
foss/2022a       GCC 11.3.0      OpenBLAS 0.3.20, FFTW 3.3.10, ScaLAPACK 2.2.0   OpenMPI 4.1.4   11.7.0*
foss/2021a       GCC 10.3.0      OpenBLAS 0.3.15, FFTW 3.3.9, ScaLAPACK 2.1.0    OpenMPI 4.1.1   11.3.1*
fosscuda/2020b   GCC 10.2.0      OpenBLAS 0.3.12, FFTW 3.3.8, ScaLAPACK 2.1.0    OpenMPI 4.0.5   11.1.1

CUDA versions marked with a * are not loaded by default by the foss toolchain.

Requesting Installs

Application installs can be requested by contacting your site representative. Generally we will install applications centrally where the application is of relevance to multiple Deceema users.

Licensed Software

For software that requires a license server, those using the software are responsible for setting up and operating the license server.

Containerisation using Apptainer

Deceema supports containerisation using Apptainer. Each compute node has Apptainer installed, which means that the apptainer command is available without needing to first load a module.

EasyBuild and ReFrame

We use

  • EasyBuild to manage the installation of the applications on Deceema. This enables us to produce repeatable installations of the applications requested by researchers while leveraging the work of the HPC community.

  • ReFrame to test the installed applications. This enables us to run test jobs every day to verify both Deceema and the installed applications.

Software (self-)Installation

Requesting an Installation

We provide a number of applications on Deceema. If an application or specific application version is not available on Deceema, please first consider requesting an install. Self-installation is recommended for users who may be developing or compiling their own code, or who require software unlikely to be used by anyone else.

This section provides information on self-installing Python, C/C++ and Fortran software. All installations should be done on a compute node; avoid installing software on login nodes. You can use a compute node by either submitting a job script which installs the software, or by starting an interactive job.

Installation on Login Nodes

Login nodes do not have GPUs or CUDA available. Also, we may kill, without warning, any long-running or CPU intensive process running on a login node.

Self-installing Python software

We provide some Python software. To install your own Python packages on top of these we recommend using a virtual environment. To create one:

  1. Load the required modules

  2. In a suitable directory, create a virtual environment:

        python -m venv --system-site-packages _venvname_
    
  3. Activate the virtual environment

        source _venvname_/bin/activate
    
  4. Install your Python software, for example using pip install _packagename_

To use the Python software installed in the virtual environment:

  1. Load the same modules as used when creating the virtual environment

  2. Activate the virtual environment

        source _venvname_/bin/activate
    
  3. Use your Python software
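Run end to end, the create-and-use steps above look like this (myvenv is a placeholder name, and the pip install step is omitted here):

```shell
# Create the virtual environment on top of the module-provided Python.
python3 -m venv --system-site-packages myvenv

# Activate it and check that python now resolves inside the environment.
source myvenv/bin/activate
python -c 'import sys; print(sys.prefix)'

# Deactivate when done; removal is just to clean up this demo.
deactivate
rm -rf myvenv
```

The sys.prefix printed while the environment is active points inside the myvenv directory, confirming that subsequent pip installs will land there.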

Self-installing C/C++/Fortran software

We provide a number of tools to help you with your software development needs.

On Deceema we provide several families of compilers. In addition, the installed applications include standard libraries and build tools, which may prove useful building blocks for your own installations.

Containerisation with Apptainer

Apptainer docs

This documentation page provides a brief overview of using Apptainer on Deceema, along with some specifics concerning its implementation. However, for further information please refer to the relevant version of the Apptainer official documentation.

Deceema supports containerisation using Apptainer. The apptainer command is available on all nodes (although note the warning below regarding /tmp) and does not require the explicit loading of a corresponding module.

Note: Apptainer replaces Singularity

The Apptainer application replaces the previous containerisation solution, Singularity. It is broadly a drop-in replacement and for now the command singularity will continue to function, although it will actually run Apptainer. For further information on the move from Singularity to Apptainer please see this article: https://apptainer.org/news/community-announcement-20211130

Pulling images and running containers

/tmp directory size

Due to the limited size of the /tmp directory on the Deceema login nodes, please run all apptainer pull commands inside Slurm jobs and therefore on the compute nodes. (We recommend using an Interactive Job to do this.)

To pull an image from Docker Hub:

    apptainer pull docker://python
    apptainer pull docker://python:3.8.11  # pulls a specific tag

Two common methods for using an Apptainer container are exec and shell:

  • apptainer exec is for where you want to run a command in a container and then exit the container as soon as the command completes

  • apptainer shell is for where you want to launch a container and then attach to its shell

An example of apptainer exec:

    apptainer exec python_3.8.11.sif python --version

The above command will spawn a container from the specified image file, execute the command python --version, print the output and then exit the container.

Using GPUs with containers

Accessing GPU resources inside your container is as simple as adding the --nv flag when starting your container. For example, if you run

    apptainer exec python_3.8.11.sif nvidia-smi

you will get a FATAL error that the nvidia-smi executable is not found. However, if you instead run

    apptainer exec --nv python_3.8.11.sif nvidia-smi

you will see information about the GPUs you have been allocated for your job.

Building containers

--fakeroot not required!

If you have previous experience of building Singularity containers please note that it’s no longer necessary to build using --fakeroot as the required privilege escalation is handled automatically.

To build an image from an Apptainer Definition File, please execute the following commands:

    unset APPTAINER_BIND
    apptainer build my_image.sif my_image_definition.def

Resetting APPTAINER_BIND

After you’ve built your image you may want to reset APPTAINER_BIND with export APPTAINER_BIND=/hype so that you can access your project directory from your container. You can bind other directories, such as /scratch-global, by passing the --bind option to your exec or shell command.

Interactive Development

To test development of an Apptainer image interactively use the --sandbox facility, which builds the image as a directory that can then be run with the --writable option.

Suggested workflow:

  1. Run: unset APPTAINER_BIND

  2. Create a sandbox directory either…

    a. from a base OS image, e.g. Rocky Linux:

    apptainer build --fix-perms --sandbox "/tmp/${USER}/my-sandbox-dir" docker://rockylinux:9.4
    

    or…

    b. From an Apptainer definition file:

    apptainer build --fix-perms --sandbox "/tmp/${USER}/my-sandbox-dir" ./my-definition-file.def
    

  3. Run the sandbox as a container in writeable mode with “root” privileges:

    apptainer shell --fakeroot --writable "/tmp/${USER}/my-sandbox-dir"
    
  4. Perform the necessary package installs and test your image’s functionality iteratively.

  5. Write the required commands back into an Apptainer Definition File.

  6. Exit the sandbox container.

  7. Build the image from the resultant definition file as per the instructions above.

Apptainer’s cache and temporary directories

By default, Apptainer uses the following directory to store its cache: ~/.apptainer/cache. However, due to the limited size of users’ home directories on Deceema we move the Apptainer cache to the following location: /tmp/apptainer/cache. This means that the Apptainer cache does not persist between jobs. There could be situations where it is desirable to retain the cache between jobs, in which case please define the following environment variable as appropriate, whilst being mindful of available storage space:

    export APPTAINER_CACHEDIR=/path/to/preferred/cache/directory
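For example, to keep the cache in your home directory (an example path; be mindful of your home directory quota):

```shell
# Persist the Apptainer cache between jobs (example location).
export APPTAINER_CACHEDIR="${HOME}/.apptainer-cache"
mkdir -p "${APPTAINER_CACHEDIR}"
```

Subsequent apptainer pull commands in the same session will then reuse previously downloaded layers from this directory.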

Using VS Code’s Remote Tunnels extension

This page contains information on how to leverage VS Code’s built-in Remote Tunnels extension on Deceema. This allows you to run VS Code locally to debug the code you are running on Deceema.

GitHub account required

This process requires that you connect VS Code with a GitHub account. If you don’t already have a GitHub account then please create one here: https://github.com/signup

The VS Code Remote Tunnels extension lets you connect to a remote machine via a secure tunnel. The following information outlines the process for leveraging this facility for use on Deceema.

Note that these docs paraphrase sections of the official Remote Tunnels documentation; for further details please refer to https://code.visualstudio.com/docs/remote/tunnels

Process Overview

  1. Install the VS Code application on your local machine, if you don’t already have it.

  2. Install the VS Code Remote Tunnels extension.

  3. Connect VS Code to your GitHub account.

  4. Submit a batch job using the example below. Note that you may want to:

    • Adjust the #SBATCH headers for the resource that you require.

      Please be considerate of other users when requesting resources for your session. There is potential for Remote Tunnel jobs to be underutilised whilst still consuming Deceema resources.

    • Add module load commands as necessary, for example loading a specific Python version.

  5. Tail (tail -f) the slurm-<job id>.out file generated by the batch job. Follow the link to authenticate VS Code via GitHub, pasting the given code as required.

    Tip

    You can copy the whole code, e.g. AAAA-BBBB, as written and paste it directly into the first of the boxes on the GitHub Device Activation page (https://github.com/login/device); there is no need to paste or type the characters individually.

  6. Once the tunnel is active you can connect using your local VS Code’s Remote Tunnels extension. Please refer to the following documentation for further info: https://code.visualstudio.com/docs/remote/tunnels#_remote-tunnels-extension

    Browser Access to Remote Server

    It’s also possible to connect to the remote instance in a browser, using the link given in the slurm-<job id>.out file, e.g. https://vscode.dev/tunnel/some-randomly-assigned-tunnel-name

  7. Cancel your job using scancel after you have finished using the Remote Tunnel.

    Idle Resources

    You must cancel your job once you have finished using the tunnel, so that the Deceema resources are freed for use by other jobs.

Example Batch Script

The following script performs these actions:

  • Start a job on a Deceema compute node.

  • Check whether the VS Code CLI binary is available, downloading it if necessary.

  • Run the VS Code Remote Tunnel, passing --accept-server-license-terms.

tunnel_job.sh
#!/bin/bash

#SBATCH --ntasks 8
#SBATCH --nodes 1
#SBATCH --time 0-1  # run for 1 hour
#SBATCH --account _projectaccount_
#SBATCH --qos _userqos_

set -e

module purge; module load deceema

# add any required module loads here, e.g. a specific Python

CLI_PATH="${HOME}/vscode_cli"

# Install the VS Code CLI command if it doesn't exist
if [[ ! -e ${CLI_PATH}/code ]]; then
    echo "Downloading and installing the VS Code CLI command"
    mkdir -p "${HOME}/vscode_cli"
    pushd "${HOME}/vscode_cli"
    # Process from: https://code.visualstudio.com/docs/remote/tunnels#_using-the-code-cli
    curl -Lk 'https://code.visualstudio.com/sha/download?build=stable&os=cli-alpine-x64' --output vscode_cli.tar.gz
    # unpack the code binary file
    tar -xf vscode_cli.tar.gz
    # clean-up
    rm vscode_cli.tar.gz
    popd
fi

# run the code tunnel command and accept the licence
${CLI_PATH}/code tunnel --accept-server-license-terms

Additional Information

Data in your home directory

The VS Code CLI (i.e. the remote end of the tunnel) stores its data in the following directory: ${HOME}/.vscode-cli

This directory in turn contains subdirectories (located in ${HOME}/.vscode-cli/server-stable/bin) for each of the local VS Code versions from which you connect. Note that these versions accumulate over time, so we recommend periodically deleting this directory to ensure that only the versions you are currently using reside there.

Exclude paths from file watching

VS Code uses a file watcher to monitor when files are created, modified or deleted in your home directory. On a shared filesystem like Deceema's, this can become resource-intensive if there are a lot of files. Use the files.watcherExclude setting in VS Code to specify the paths you would like to exclude from file watching (there is no need to exclude symlinks from your home folder).
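A sketch of the corresponding settings.json entry; the excluded paths below are examples only, not a recommended set:

```json
{
  "files.watcherExclude": {
    "**/.git/objects/**": true,
    "**/data/**": true,
    "**/.conda/**": true
  }
}
```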

Remote Tunnel Security

For info on how the tunnels are secured, see: https://code.visualstudio.com/docs/remote/tunnels#_how-are-tunnels-secured