General Purpose GPU

The HPC systems at Leeds also include general purpose GPU facilities for running GPU-accelerated code. The available GPUs differ between ARC3 and ARC4, but in both cases they are requested through the job submission script, as outlined below.

GPUs are not available on the login nodes, so any code that requires a GPU must be run as a job submitted with the appropriate arguments. Specific guidance on the parameters required to use a GPU is given below; general guidance on submitting jobs to the queues can be found in the Batch jobs section.

CUDA Module

To use the GPU cards, you will need to ensure that the NVIDIA CUDA toolkit module is loaded into your environment:

$ module load cuda
# This needs to be done before compiling or running GPU code
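
Once the module is loaded, the CUDA compiler, nvcc, is available on your path. As a minimal sketch, assuming a CUDA source file named saxpy.cu in the current directory (the filename is only an example), a GPU executable could be built with:

$ nvcc -O2 -o saxpy saxpy.cu
# saxpy is an example name; the resulting executable must be run on a GPU node via a job (see below)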

Note that this version of the CUDA environment will only work with certain compiler versions:

  • Intel versions 15 and 16

  • PGI versions >= 16.3

  • GNU version 4.8.2 (this is the gnu/native compiler)

To confirm which cards you have been allocated, use the command:

$ nvidia-smi -L
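
This command only works on a node with GPUs attached, i.e. from within a job that has requested GPU resources. As a hedged sketch, assuming interactive sessions accept the same -l resource requests as batch jobs (check the Batch jobs section for the locally supported form), the allocation could be inspected interactively with:

$ qrsh -l h_rt=1:0:0,coproc_p100=1
# Then, on the allocated GPU node:
$ nvidia-smi -L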

ARC3

The ARC3 system is the first cluster at Leeds to include GPU accelerator technology. Its GPU nodes are:

  • 2 nodes, each with 24 cores, 128GB of system memory, an 800GB local hard disk drive and 2 x NVIDIA K80 (24GB)

  • 6 nodes, each with 24 cores, 256GB of system memory, a 650GB local SSD and 4 x NVIDIA P100 (12GB)

Usage

K80 GPU

To request K80 GPU resources, use the flag:

#$ -l coproc_k80=<cards_per_compute_node>

Where <cards_per_compute_node> should be set to 1 or 2.

Each value of coproc_k80 maps to a fixed share of a K80 node:

  • #$ -l coproc_k80=1 requests a single K80 card, 12 CPU cores and 64GB of system memory (half of the available resource on a K80 node)

  • #$ -l coproc_k80=2 requests both K80 cards, 24 CPU cores and 128GB of system memory (all of the available resource on a K80 node)
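
As an illustration, here is a minimal sketch of a batch script requesting half of a K80 node; the runtime and the executable name ./my_gpu_app are placeholders rather than part of the ARC documentation:

#!/bin/bash
# Request one hour of runtime
#$ -l h_rt=1:0:0
# Request one K80 card (this also allocates 12 cores and 64GB of memory)
#$ -l coproc_k80=1
# Run from the current working directory
#$ -cwd

# Load the CUDA environment before running GPU code
module load cuda

# Confirm which card was allocated, then run the application
nvidia-smi -L
./my_gpu_app

The script is then submitted with qsub in the usual way, as described in the Batch jobs section.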

P100 GPU

To request P100 GPU resources, use the flag:

#$ -l coproc_p100=<cards_per_compute_node>

Where <cards_per_compute_node> should be set to 1, 2, 3 or 4.

Each value of coproc_p100 maps to a fixed share of a P100 node:

  • #$ -l coproc_p100=1 requests a single P100 card, 6 CPU cores and 64GB of system memory (one quarter of the available resource on a P100 node)

  • #$ -l coproc_p100=2 requests two P100 cards, 12 CPU cores and 128GB of system memory (half of the available resource on a P100 node)

  • #$ -l coproc_p100=3 requests three P100 cards, 18 CPU cores and 192GB of system memory (three quarters of the available resource on a P100 node)

  • #$ -l coproc_p100=4 requests four P100 cards, 24 CPU cores and 256GB of system memory (all of the available resource on a P100 node)
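
For multi-card jobs only the resource request changes; a sketch requesting half of a P100 node for an application able to use two GPUs (the executable name is again a placeholder) might look like:

#!/bin/bash
#$ -l h_rt=6:0:0
# Two P100 cards, 12 cores and 128GB of memory
#$ -l coproc_p100=2
#$ -cwd

module load cuda

# The application itself is responsible for spreading work across both cards
./my_multi_gpu_app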

ARC4

The ARC4 system also provides the following GPU resources:

  • 3 nodes, each with 40 cores, 192GB of system memory, a 128GB local SSD and 4 x NVIDIA V100 (32GB)

Usage

V100 GPU

To request V100 GPU resources, use the flag:

#$ -l coproc_v100=<cards_per_compute_node>

Where <cards_per_compute_node> should be set to 1, 2, 3 or 4.

Each value of coproc_v100 maps to a fixed share of a V100 node:

  • #$ -l coproc_v100=1 requests a single V100 card, 10 CPU cores and 48GB of system memory (one quarter of the available resource on a V100 node)

  • #$ -l coproc_v100=2 requests two V100 cards, 20 CPU cores and 96GB of system memory (half of the available resource on a V100 node)

  • #$ -l coproc_v100=3 requests three V100 cards, 30 CPU cores and 144GB of system memory (three quarters of the available resource on a V100 node)

  • #$ -l coproc_v100=4 requests four V100 cards, 40 CPU cores and 192GB of system memory (all of the available resource on a V100 node)
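
A whole V100 node can be reserved by requesting all four cards; a minimal sketch (again with a placeholder executable name):

#!/bin/bash
#$ -l h_rt=12:0:0
# All four V100 cards, 40 cores and 192GB of memory on one node
#$ -l coproc_v100=4
#$ -cwd

module load cuda

nvidia-smi -L
./my_gpu_app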