Difference between revisions of "OpenHPC"

Revision as of 16:38, 6 January 2022

Job Emails

It is now possible to receive email alerts when certain job events occur, using Slurm's built-in --mail-type SBATCH directive.

The most commonly used type values are listed below (multiple values may be specified in a comma-separated list):

  • NONE (the default if you don't set --mail-type)
  • BEGIN
  • END
  • FAIL
  • REQUEUE
  • ALL (equivalent to BEGIN, END, FAIL, INVALID_DEPEND, REQUEUE, and STAGE_OUT)
  • INVALID_DEPEND (dependency never satisfied)
  • TIME_LIMIT, TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent of time limit), TIME_LIMIT_50 (reached 50 percent of time limit)
  • ARRAY_TASKS (sends an email for each array task; otherwise BEGIN, END and FAIL apply to the job array as a whole rather than generating an individual email for each task in the array)

The user to be notified is specified with --mail-user; however, only @hull.ac.uk email addresses are valid.
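
As a minimal sketch of how the two directives fit together, the script below asks for an email when the job begins, ends or fails. The job name and the ten-minute time limit are illustrative assumptions, and the --mail-user placeholder must be replaced with your own @hull.ac.uk address before submitting.

```shell
#!/bin/bash
#SBATCH -J mailexample                  # hypothetical job name, for illustration only
#SBATCH -n 1
#SBATCH -p compute
#SBATCH --time=00:10:00                 # assumed short time limit for a test job
#SBATCH --mail-type=BEGIN,END,FAIL      # comma-separated list of event types
#SBATCH --mail-user=<your Hull email address>

# SLURM_JOB_ID is set by Slurm inside a running job; fall back to "unknown"
# so the script can also be run by hand for a quick check.
message="Running job ${SLURM_JOB_ID:-unknown}"
echo "$message"
```

Adding further types such as TIME_LIMIT_90 to the same comma-separated list requests additional notifications without any other change to the script.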

Slurm Information

It is now possible to view the submission script that was used to submit a job. This is done using sacct -B -j <jobnumber>, e.g.:

$ sacct -B -j 317
Batch Script for 317
--------------------------------------------------------------------------------
#!/bin/bash
#SBATCH -J jobsubmissionfile
#SBATCH -n 1
#SBATCH -o slurm-%j.out
#SBATCH -e slurm-%j.out
#SBATCH -p compute
#SBATCH --exclusive
#SBATCH --time=1-00:00:00
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=<your Hull email address>

echo "This is my submission script"
sleep 10
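
Because sacct -B prints the stored script as plain text, its output can be filtered with standard tools. The following is a sketch only: mail_directives is a hypothetical helper name, not part of Slurm, and it simply keeps the --mail-type and --mail-user lines from whatever script text it reads on standard input.

```shell
#!/bin/bash
# Hypothetical helper (not a Slurm command): print only the --mail-type and
# --mail-user directives from a batch script read on standard input.
mail_directives() {
    grep -E '^#SBATCH[[:space:]]+--mail-(type|user)'
}

# Intended use on the cluster, with the job number from the example above:
#   sacct -B -j 317 | mail_directives
```

For the job shown above, this should leave just the two #SBATCH --mail lines, which is a quick way to confirm which notifications a past job requested.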

Using Containers