General/Slurm

* Further information: https://slurm.schedmd.com/
* Slurm Rosetta (https://slurm.schedmd.com/rosetta.pdf): useful for converting submission scripts from other formats
Revision as of 10:58, 2 February 2017

Application Details

Introduction

The SLURM (Simple Linux Utility for Resource Management) workload manager is a free and open-source job scheduler for Linux clusters. It is used by Viper and many of the world's supercomputers and clusters.

Common Slurm Commands

Command Description
sbatch Submits a batch script to SLURM. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input.
squeue Used to view job and job step information for jobs managed by SLURM.
scancel Used to signal or cancel jobs, job arrays or job steps.
sinfo Used to view partition and node information for a system running SLURM.

sbatch

[username@login01 ~]$ sbatch jobfile.job
Submitted batch job 289535
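The contents of jobfile.job are not shown above; a minimal batch script might look like the following sketch (the job name, output file pattern and echo payload are illustrative assumptions, not site defaults). Because the #SBATCH lines are shell comments, the script can also be run directly with bash for testing.

```shell
#!/bin/bash
#SBATCH -J example_job         # job name (illustrative)
#SBATCH -n 1                   # a single task
#SBATCH -o example_job.%j.out  # output file; %j expands to the job ID
#SBATCH -p compute             # partition to submit to

# Payload: report where the job ran.
msg="Hello from $(hostname)"
echo "$msg"
```

Submitting it with sbatch jobfile.job returns the job ID, as shown above.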

squeue

[username@login01 ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            306414   compute  clasDFT   495711  R      16:36      1 c006
            306413   compute mpi_benc   535286  R      31:02      2 c[005,007]
            306411   compute  orca_1n   442104  R    1:00:32      1 c004
            306410   compute  orca_1n   442104  R    1:04:17      1 c003
            306409   highmem cnv_obit   524274  R   11:37:17      1 c232
            306407   compute  20M4_20   535822  R   11:45:54      1 c012
            306406   compute 20_ML_20   535822  R   11:55:40      1 c012

The ST column gives the job state (R = running, PD = pending), and NODELIST(REASON) lists the allocated nodes or, for a pending job, the reason it is waiting.

scancel

[username@login01 ~]$ scancel 289535

sinfo

[username@login01 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 2-00:00:00      9    mix c[006,012,014,016,018-020,022,170]
compute*     up 2-00:00:00     11  alloc c[003-004,008,015,046,086,093,098,138,167-168]
compute*     up 2-00:00:00    156   idle c[001-002,005,007,009-011,013,017,021,023-045,047-085,087-092,094-097,099-137,139-166,169,171-176]
highmem      up 4-00:00:00      1    mix c230
highmem      up 4-00:00:00      2  alloc c[231-232]
highmem      up 4-00:00:00      1   idle c233
gpu          up 5-00:00:00      4   idle gpu[01-04]

STATE shows whether nodes are idle, fully allocated (alloc) or only partially allocated (mix); TIMELIMIT is the maximum wall time for jobs in that partition.

Common Submission Flags

Flag Description
-J / --job-name Specify a name for the job
-N / --nodes Specifies the number of nodes to be allocated to a job
-n / --ntasks Specifies the number of tasks (cores) to allocate, e.g. for one compute node the maximum would be 28
-o / --output Specifies the name of the output file
-e / --error Specifies the name of the error file
-p / --partition Specifies the specific partition for the job e.g. compute, highmem, gpu
--exclusive Requests exclusive access to nodes, preventing other jobs from sharing them
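Putting the flags above together, a fuller submission script might look like this sketch (the job name and file names are illustrative; the 28-core figure for a compute node comes from the table above):

```shell
#!/bin/bash
#SBATCH -J flag_demo            # -J / --job-name
#SBATCH -N 1                    # -N / --nodes: one node
#SBATCH -n 28                   # -n / --ntasks: all 28 cores of a compute node
#SBATCH -o flag_demo.%j.out     # -o / --output
#SBATCH -e flag_demo.%j.err     # -e / --error
#SBATCH -p compute              # -p / --partition
#SBATCH --exclusive             # keep other jobs off the node

# SLURM_NTASKS is set by Slurm at run time; fall back to 28 when
# running the script directly with bash.
ntasks="${SLURM_NTASKS:-28}"
echo "Running with $ntasks tasks"
```

With --exclusive and all 28 tasks requested, the node is dedicated to this job for its duration.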