General/Slurm

Revision as of 15:52, 13 March 2017

Application Details

Description: SLURM is an open-source job scheduler used by HPC systems.
Version: 15.08.8

Introduction

The SLURM (Simple Linux Utility for Resource Management) workload manager is a free and open-source job scheduler for the Linux kernel. It is used by Viper and many of the world's supercomputers (and clusters).

As a cluster workload manager, Slurm has three key functions:

First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
Third, it arbitrates contention for resources by managing a queue of pending jobs.

Slurm takes your batch job submission and executes it across the compute nodes of Viper. How the job is processed depends on a number of factors, including the queue (partition) it is submitted to and the jobs already waiting in that queue.

Common Slurm Commands

Command	Description
sbatch	Submits a batch script to SLURM. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input.
squeue	Used to view job and job step information for jobs managed by SLURM.
scancel	Used to signal or cancel jobs, job arrays or job steps.
sinfo	Used to view partition and node information for a system running SLURM.

sbatch

Used to submit a job to Slurm.

[username@login01 ~]$ sbatch jobfile.job
Submitted batch job 289535

The number displayed (289535) is the Job ID.
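
When a script needs the Job ID, for example to pass it to scancel later, it can be picked out of the confirmation line. The following is a minimal sketch rather than a definitive method: parse_job_id is a hypothetical helper name, and it assumes sbatch prints exactly the "Submitted batch job" line shown above.

```shell
#!/bin/sh
# parse_job_id (hypothetical helper): print the Job ID contained in
# sbatch's confirmation line, e.g. "Submitted batch job 289535".
# Prints nothing if the line does not have that form.
parse_job_id() {
    printf '%s\n' "$1" | awk '/^Submitted batch job [0-9]+/ { print $4 }'
}

# Demonstration using the confirmation line from the example above.
# On the cluster the line would come from: "$(sbatch jobfile.job)"
job_id=$(parse_job_id "Submitted batch job 289535")
echo "Job ID: $job_id"
```

sbatch also has a --parsable option that prints the Job ID without the surrounding text; check man sbatch on the system to confirm it is available in the installed version.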

squeue

squeue shows information about jobs in the scheduling queue.

[username@login01 ~]$ squeue
             JOBID  PARTITION     NAME   USER ST       TIME  NODES NODELIST(REASON)
             306414   compute  clasDFT   user  R      16:36      1 c006
             306413   compute mpi_benc   user  R      31:02      2 c[005,007]
             306411   compute  orca_1n   user  R    1:00:32      1 c004
             306410   compute  orca_1n   user  R    1:04:17      1 c003
             306409   highmem cnv_obit   user  R   11:37:17      1 c232
             306407   compute  20M4_20   user  R   11:45:54      1 c012
             306406   compute 20_ML_20   user  R   11:55:40      1 c012

Heading	Description
JOBID	The unique identifier assigned to a job
PARTITION	The type of node the job is running on e.g. compute, highmem, GPU
NAME	Name of job
USER	User ID of job owner
ST	Job state code e.g. R stands for 'Running'
TIME	Length of time a job has been running
NODES	Number of nodes a job is running on
NODELIST(REASON)	List of nodes a job is running on. For a job that is not yet running, the reason it is waiting is shown instead, e.g. a dependency on another job.
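
Because squeue prints one job per line with fixed columns, its output is easy to summarise with standard tools. The sketch below counts jobs per partition; count_by_partition is a hypothetical helper, and the sample text is the listing shown above, used so the example runs without a cluster.

```shell
#!/bin/sh
# count_by_partition (hypothetical helper): read squeue-style output on
# stdin, skip the header line, and print "<partition> <job count>" pairs.
count_by_partition() {
    awk 'NR > 1 && NF > 0 { n[$2]++ } END { for (p in n) print p, n[p] }' | sort
}

# Sample input copied from the squeue listing above.
# On the cluster this would be: squeue | count_by_partition
sample='             JOBID  PARTITION     NAME   USER ST       TIME  NODES NODELIST(REASON)
             306414   compute  clasDFT   user  R      16:36      1 c006
             306413   compute mpi_benc   user  R      31:02      2 c[005,007]
             306411   compute  orca_1n   user  R    1:00:32      1 c004
             306410   compute  orca_1n   user  R    1:04:17      1 c003
             306409   highmem cnv_obit   user  R   11:37:17      1 c232
             306407   compute  20M4_20   user  R   11:45:54      1 c012
             306406   compute 20_ML_20   user  R   11:55:40      1 c012'

summary=$(printf '%s\n' "$sample" | count_by_partition)
echo "$summary"
```

To list only your own jobs, squeue accepts a user filter: squeue -u username.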

scancel

scancel is used to cancel pending or running jobs. You can only cancel jobs submitted under your own user ID.

[username@login01 ~]$ scancel 289535

No output is given by the command.

sinfo

sinfo shows information on the partitions and nodes in the cluster.

[username@login01 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 2-00:00:00      9    mix c[006,012,014,016,018-020,022,170]
compute*     up 2-00:00:00     11  alloc c[003-004,008,015,046,086,093,098,138,167-168]
compute*     up 2-00:00:00    156   idle c[001-002,005,007,009-011,013,017,021,023-045,047-085,087-092,094-097,099-137,139-166,169,171-176]
highmem      up 4-00:00:00      1    mix c230
highmem      up 4-00:00:00      2  alloc c[231-232]
highmem      up 4-00:00:00      1   idle c233
gpu          up 5-00:00:00      4   idle gpu[01-04]
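
The NODES and STATE columns can be combined to see how much of the cluster is free. The following sketch totals the nodes in a given state; nodes_in_state is a hypothetical helper, and the sample text is the sinfo listing shown above, used so the example runs without a cluster.

```shell
#!/bin/sh
# nodes_in_state (hypothetical helper): read sinfo-style output on stdin
# and print the sum of the NODES column for rows whose STATE equals $1.
nodes_in_state() {
    awk -v state="$1" 'NR > 1 && $5 == state { total += $4 } END { print total + 0 }'
}

# Sample input copied from the sinfo listing above.
# On the cluster this would be: sinfo | nodes_in_state idle
sample='PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 2-00:00:00      9    mix c[006,012,014,016,018-020,022,170]
compute*     up 2-00:00:00     11  alloc c[003-004,008,015,046,086,093,098,138,167-168]
compute*     up 2-00:00:00    156   idle c[001-002,005,007,009-011,013,017,021,023-045,047-085,087-092,094-097,099-137,139-166,169,171-176]
highmem      up 4-00:00:00      1    mix c230
highmem      up 4-00:00:00      2  alloc c[231-232]
highmem      up 4-00:00:00      1   idle c233
gpu          up 5-00:00:00      4   idle gpu[01-04]'

idle=$(printf '%s\n' "$sample" | nodes_in_state idle)
echo "Idle nodes: $idle"
```

The comparison is an exact match on the state name, so states that sinfo prints with a suffix character (for example a trailing * on a non-responding node) would need to be handled separately.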

Common Submission Flags

For convenience and repeatability, Slurm options are usually written into a batch file rather than typed on the command line each time. The following are the most commonly used Slurm batch file flags.

Flag	Description
-J / --job-name	Specify a name for the job
-N / --nodes	Specifies the number of nodes to be allocated to a job
-n / --ntasks	Specifies the number of tasks (processes) to run, e.g. for 1 compute node the maximum would be 28
-o / --output	Specifies the name of the output file
-e / --error	Specifies the name of the error file
-p / --partition	Specifies the partition for the job e.g. compute, highmem, gpu
--exclusive	Requests exclusive access to nodes preventing other jobs from running

Example Job Submission Script

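#!/bin/bash

# Job name, node count, task count (see the flags table above)
#SBATCH -J Example_Slurm_Job
#SBATCH -N 1
#SBATCH -n 28
# Output and error files: %N = node name, %j = job ID, %a = job array index
#SBATCH -o %N.%j.%a.out
#SBATCH -e %N.%j.%a.err
# Partition, with exclusive access to the allocated node
#SBATCH -p compute
#SBATCH --exclusive

# Print the list of nodes allocated to this job
echo "$SLURM_JOB_NODELIST"

# Start from a clean environment, then load the required modules
module purge
module add gcc/4.9.3

# Intel MPI settings: debug output level, communication fabrics,
# and no fallback to another fabric if the chosen one is unavailable
export I_MPI_DEBUG=5
export I_MPI_FABRICS=shm:tmi
export I_MPI_FALLBACK=no

# Run the application
/home/user/slurmexample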

For more information on creating batch jobs, visit the Batch Jobs guide.
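
When many similar jobs are needed, a script like the example above can be generated rather than edited by hand. The sketch below is one possible approach, not an official Viper tool: make_job_script and CORES_PER_NODE are hypothetical names, and the value 28 is an assumption taken from the compute-node figure quoted in the flags table.

```shell
#!/bin/sh
# make_job_script (hypothetical helper): print a batch script to stdout.
#   $1 = job name, $2 = number of nodes, $3 = partition, $4 = command to run
# CORES_PER_NODE=28 is an assumption based on the compute-node figure in
# the flags table; adjust it for other node types.
CORES_PER_NODE=28

make_job_script() {
    # Request one task per core on every node.
    ntasks=$(( $2 * CORES_PER_NODE ))
    cat <<EOF
#!/bin/bash
#SBATCH -J $1
#SBATCH -N $2
#SBATCH -n $ntasks
#SBATCH -o %N.%j.%a.out
#SBATCH -e %N.%j.%a.err
#SBATCH -p $3
#SBATCH --exclusive

echo \$SLURM_JOB_NODELIST

$4
EOF
}

# Print a two-node version of the example job.
make_job_script Example_Slurm_Job 2 compute /home/user/slurmexample
```

Redirecting the output to a file (for example jobfile.job) gives a script that can then be submitted with sbatch jobfile.job.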

Further Information

Slurm Website: https://slurm.schedmd.com/
Slurm Rosetta (useful for converting submission scripts from other schedulers)
Application-specific submission scripts may be found on the Application support page.

