Advanced Job Examples
Advanced Job Examples¶
Submitting Mixed Mode Jobs¶
The ‘mixed mode’ (MPI+OpenMP) programming model typically involves MPI processes running across nodes and OpenMP threads upon each node with the total number of processes (MPI*OpenMP) equalling the number of physical processor cores.
Your code will need to call MPI_Init and make use of OpenMP directives. You will compile your code using an MPI wrapper and enabling OpenMP support, for example:
$ mpif90 -openmp example.f90 -o mixed.exe
You will need to determine
ppn, the number of MPI processes per node, and
tpp, the number of OpenMP threads per MPI process.
Additionally, you can either ask for a given number of nodes nodes or for the total number of MPI processes
np. Note that
ppn is related to
Your submission script would then need to contain:
#$ -V #$ -l hr_t=01:00:00 #$ -l nodes=<nodes>, ppn=<ppn>, tpp=<tpp> mpirun ./a.out
#$ -V #$ -l hr_t=01:00:00 #$ -l np=<np>, ppn=<ppn>, tpp=<tpp> mpirun ./a.out
There are 24 cores per node on ARC3 , so you would typically ensure
There are 40 cores per node on ARC4 , so you would typically ensure
Using ARC4, To run an MPI+OpenMP executable mixed.exe with 80 MPI processes each launching 4 OpenMP threads, the following submission script would be needed:
#$ -V #$ -cwd #$ -l hr_t=01:00:00 #$ -l np=80, ppn=10, tpp=4 mpirun ./mixed.exe
This will allocate 8 nodes (=8*40=320 cores). Each node will have 10 MPI processes, each of which will have 4 OpenMP threads (so 10*4=40 processes per node in total, and 8*140=320 (=80MPI*4OpenMP) processes in total.
Alternatively, the same effect can be achieved by:
#$ -V #$ -cwd #$ -l hr_t=01:00:00 #$ -l nodes=8, ppn=10, tpp=4 mpirun ./mixed.exe
Note that the
OMP_NUM_THREADS environment variable is automatically set by the batch system and so you do not need to set this in your environment.
The SGE scheduler allows you to submit a job but then hold them in the queue until certain job dependencies are met, ie. it will hold the job until a previous job (or number of jobs have completed).
To do this, submit the first job as normal:
$ qsub job1.sh
As usual, the scheduler will give you a job ID
if you then want to submit another job that is dependent on the first one completing, use the
-hold_jid option for qsub:
$ qsub -hold_jid <jobid> job2.sh
Job 2 will remain in the queue until job 1 has completed.
Alternatively, if you have given the job a name (using the
-N option), then you can put a hold on subsequent jobs using the name instead:
$ qsub -hold_jid <jobname> job2.sh
If you give several jobs the same name, then the job will be held until all the named jobs have been completed.
See Batch jobs for more details on