Batch jobs#
The computational power of the HPC facilities at Leeds is organised through a batch job scheduling system. Users submit a script to the scheduler that outlines the resources required and the program to be run; the scheduler then allocates that job a position in the queue of jobs. The scheduler software that runs on both ARC3 and ARC4 is Son of Grid Engine, plus locally developed and implemented patches.
Job scripts#
The scripts submitted are referred to as job scripts or job submission scripts. These are shell scripts (files ending in .sh) that, at a bare minimum, specify:
- how long the job needs to run for
- how many processors to run on (assumed to be 1 unless otherwise specified)
With this information, the scheduler is able to run jobs at some point in the future when the resources become available. Crucially, the queue is not first-come, first-served: a fair-share policy guides the scheduler towards allocating resources fairly between different faculties and users.
The common commands used on HPC to interact with batch jobs are:
- qsub - submits a job script to the scheduler
- qstat - checks the status of submitted jobs
- qdel - deletes a specified job from the queue
Writing job scripts on HPC#
We encourage users to write their job submission scripts using the text editor tools available on ARC3 and ARC4, such as:
- nano (recommended for beginners)
- gedit
- vim
- emacs
To create a new job submission file on HPC, run nano job_submit.sh or vim job_submit.sh. This opens a new empty file in the text editor, ready for you to write its contents.
Warning
Job scripts written on Windows computers contain invisible Windows-style line ending characters that cause job submission failures such as:
/bin/bash^M: bad interpreter: No such file or directory
You can run dos2unix job_script.sh on HPC to convert your script to the correct line endings.
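If you are unsure whether a script has Windows line endings, the standard file command will report them. A minimal check-and-convert sequence (assuming a script named job_script.sh; exact output will vary slightly) looks like:
$ file job_script.sh
job_script.sh: ASCII text, with CRLF line terminators
$ dos2unix job_script.sh
dos2unix: converting file job_script.sh to Unix format...
$ file job_script.sh
job_script.sh: ASCII text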
The Hello world job script#
In this basic hello world example we’ve got a job script called job_script1.sh that requests a single core, 1GB of RAM, and 15 minutes of run time in order to run some R code.
#!/bin/bash
# R single core submission script
# Run with current environment (-V) and in the current directory (-cwd)
#$ -V -cwd
# Request some time - min 15 mins, max 48 hours
#$ -l h_rt=00:15:00
# Request some memory per core
#$ -l h_vmem=1G
# Get email at start and end of the job
#$ -m be
# Now run the job
module load R
R CMD BATCH R.in R.out
We can submit this script to the scheduler using qsub:
$ qsub job_script1.sh
Your job 42 ("job_script1.sh") has been submitted
This returns some text to confirm our job has been submitted and provides us with the job's unique ID number (in this case, 42).
Note
Array jobs
If you intend to run multiple similar jobs with the same resource specification, we recommend using the task array system.
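As a minimal sketch of the idea (the program and input file names here are hypothetical), the -t option produces numbered sub-tasks, and each sub-task reads its own number from the SGE_TASK_ID environment variable set by the scheduler:
#$ -V -cwd
#$ -l h_rt=00:15:00
#$ -l h_vmem=1G
# Run 10 sub-tasks, numbered 1 to 10
#$ -t 1-10
# Each sub-task processes a different input file, selected by its task ID
./process_data input_${SGE_TASK_ID}.txt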
Resource specification#
The job script shown above includes a number of lines that request some amount of compute resource for our job, using the syntax #$ <option>. These lines are comments as far as the shell is concerned, but they are read by the scheduler to determine how much compute resource is required and thus how to fit the job into the queue.
The first options we specify in the example script above are:
#$ -V -cwd
-V tells the scheduler to use the current environment (including environment variables and loaded modules), and -cwd tells it to run the job within the current directory (helping ensure any paths specified are correct). These options are commonly included in all job scripts.
Next we have the lines requesting a time allocation and the amount of memory required per core:
#$ -l h_rt=00:15:00
#$ -l h_vmem=1G
Here -l h_rt=hh:mm:ss requests a specific amount of runtime, up to a maximum of 48 hours. The line -l h_vmem= requests a specific amount of memory per core; in the example we request 1GB to run on 1 core.
To use multiple cores within a single node using the OpenMP protocol, include the following option:
#$ -pe smp np
where np is the number of cores you wish to request. The maximum number of cores you can request with smp is the total number available on a node (40 on ARC4, 24 on ARC3).
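As an illustrative sketch (the program name is a placeholder), a four-core OpenMP job might look like the following. Note that h_vmem is per core, so this job can use 4GB in total, and the scheduler sets the NSLOTS environment variable to the number of cores granted:
#$ -V -cwd
#$ -l h_rt=01:00:00
# 1G per core across 4 cores = 4G in total
#$ -l h_vmem=1G
# Request 4 cores on a single node
#$ -pe smp 4
# Tell OpenMP to use one thread per requested core
export OMP_NUM_THREADS=$NSLOTS
./my_openmp_program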
Warning
If your job attempts to run for longer than the requested runtime, or to use more memory per core than requested, the scheduler will kill it. This is one of the most common problems people encounter; read more on the troubleshooting page.
Next we have options to request notifications about the job:
#$ -m be
The -m option specifies that we wish to receive an email about the job; the argument be requests one at the start and one at the end of the job. These emails are sent automatically to your University of Leeds email address.
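If you want the notifications delivered elsewhere, the -M option (see the table below) overrides the destination address. For example, with a placeholder address:
# Email at the start (b) and end (e) of the job
#$ -m be
# Send the notifications to this address instead of the default
#$ -M j.bloggs@leeds.ac.uk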
List of SGE options#
| Option | Description | Default |
|---|---|---|
| -l h_rt=hh:mm:ss | The wall clock time (amount of real time needed by the job). This parameter must be specified; failure to include it will result in an error message. | Required |
| -l h_vmem=memory | Sets the limit of virtual memory required per core. If you require more than 1GB per process you must specify this flag, e.g. -l h_vmem=12G requests 12GB per core. | 1G |
| -pe smp np | Specifies the shared memory parallel environment for parallel programs using OpenMP/threads, where np is the number of cores. | 1 |
| -pe ib np | Specifies the parallel environment for parallel programs using MPI, where np is the number of processes. | |
| -l nodes=x[,ppn=y][,tpp=z] | Specifies a job for parallel programs using MPI. Assigns whole compute nodes: x nodes, y processes per node and z threads per process. | |
| -l np=x[,ppn=y][,tpp=z] | Specifies a job for parallel programs using MPI. Assigns whole compute nodes: x processes in total, y processes per node and z threads per process. | |
| -l node_type= | Specifies the type of node to be used. | 40core-192G (ARC4), 24core-128G (ARC3) |
| -l coproc_v100=n | A wrapper line for requesting resources on GPGPU nodes, see GPGPU page for more details. Passing a number between 1 and 4 requests a proportion of the resources on GPGPU nodes. | |
| -l disk=, -l disk_type= | These parameters control local disk usage. Please refer to Temporary/Scratch Storage on Compute nodes for more details. | |
| -hold_jid prev_job_id | Hold the job until the previous job (prev_job_id) has completed. | |
| -l placement=type | Choose optimal for launching a process topology which minimises the number of InfiniBand switch hops used in the calculation, minimising latency. Choose scatter to place processes on any available nodes, regardless of topology. | |
| -t start-stop | Produce an array of sub-tasks (loop) numbered from start to stop. | |
| -P name | Submit the job to a specific private node queue, where name is the name of that queue. | |
| -help | Prints a list of options. | |
| -cwd | Execute the job from the current working directory; output files are sent to the directory from which the job was submitted, not to the user's home directory. | |
| -V | Export all current environment variables to all spawned processes. Necessary for the current module environment to be transferred to the SGE shell. | |
| -m be | Send mail at the beginning (b) and end (e) of the job. | |
| -M email_address | Specify the mail address for the -m option. | Your user email address |
| -j y | Combine the standard error and standard output into one file. | |
| -N jobname | Give the job a specific name, where jobname is the name to use. | Name of submission script |
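Putting several of these options together, an illustrative submission script (the program name is a placeholder) might look like:
#!/bin/bash
# Use the current environment and run from the current directory
#$ -V -cwd
# 6 hours of runtime, 2G of memory per core
#$ -l h_rt=06:00:00
#$ -l h_vmem=2G
# Name the job and merge standard output and standard error into one file
#$ -N my_analysis
#$ -j y
# Email at the start and end of the job
#$ -m be
./run_analysis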
Monitoring jobs#
Once you’ve submitted a job you can monitor its progress in the queue using the command qstat JOBID, where JOBID is the unique numeric ID of your submitted job.
In the example below we’ve just submitted a job and want to check its status in the queue.
$ qstat 54
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
54 0.00000 test_sub.s exuser qw 08/21/2019 14:09:08 1
Using qstat returns a table with the following columns:
- the job ID
- the job priority, determined by the scheduler's fair-share policy
- the name of the submission script
- the user who submitted the job
- the job's state in the queue (qw - waiting; r - running; hqw - held waiting; Eqw - errored whilst queueing; t - transferring)
- the submission time, or the start time if the job is running
- the number of slots (cores) requested
- the task ID number, if the job is a task array
Note
If you run qstat expecting output and nothing is returned, you have no jobs currently in the queue. This usually means your job has completed.
You can also pass qstat additional arguments to view your own queued jobs, those of other users, or the entire queue.
$ qstat -u 'username'
# where username is your own or another username
$ qstat -u '*'
# will return the entire queue
Deleting jobs#
Sometimes you may need to delete one of your submitted jobs from the queue. You can do this with the command qdel JOBID, where JOBID is the unique numeric ID of the job you wish to delete.
When the job is successfully deleted we get the following output:
$ qdel 42
exuser has deleted job 42
Note
You can only use qdel to delete your own submitted jobs from the queue; attempting to delete another user's jobs will not work.
Job output#
When a job runs it produces two output files by default, even if you haven't specified that your code should write a results file. These contain the standard output and standard error produced by your job, and are named following the pattern submission_script.sh.oJOBID and submission_script.sh.eJOBID, where submission_script.sh is the name of your job script and JOBID is the unique numeric ID assigned when the job was submitted.
For example, if we submitted a job called test_run.sh and it was given the job ID 4689, we'd expect the following files to be produced alongside any results files:
test_run.sh.o4689
test_run.sh.e4689
Both these files contain useful information about how the job progressed and are especially useful if your job encountered an error. You can read more about using these files to help troubleshoot problems in the troubleshooting section.
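For example, you can quickly inspect them from the command line:
$ cat test_run.sh.o4689
# standard output: messages your program printed while running
$ cat test_run.sh.e4689
# standard error: check here first if the job failed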
Job holding#
Often a workflow involves a number of steps, where each step requires the output of the previous one and should only start once it has completed. The scheduler has a job dependency system built in, which allows you to submit a series of jobs and specify that a job should be held until another job has completed.
For instance, take two submission scripts job1.sh and job2.sh, where job2.sh should only begin once job1.sh has finished. You can specify this when submitting the jobs as follows:
$ qsub job1.sh
Your job 626 ("job1.sh") has been submitted
$ qsub -hold_jid 626 job2.sh
Your job 627 ("job2.sh") has been submitted
job2.sh will then be held in the queue until job1.sh has completed.
This allows for developing a workflow of jobs that run one after another to complete long stepwise tasks.
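If you build such chains inside a script, qsub's -terse option prints only the numeric job ID, which makes it straightforward to capture each ID for the next submission. A minimal sketch, reusing the two scripts above:
# Capture the first job's ID (-terse prints just the ID)
JOB1=$(qsub -terse job1.sh)
# Submit the second job, held until the first completes
qsub -hold_jid "$JOB1" job2.sh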