General/Interactive
Introduction
An interactive session can be started on Viper for any task that requires interaction, and should be used for any computationally demanding work rather than running it on the login node.
Examples for interactive usage include:
- Code compilation
- Data analysis
- Basic visualisation
- Console-based interactive applications such as Python, R, Matlab, SAS or Stata
- Graphical user interfaces such as Matlab, SAS or Stata
Interactive Sessions
Starting an interactive session
An interactive session can be started by using the interactive command:
[username@login01 ~]$ interactive
salloc: Granted job allocation 306844
Job ID 306844
connecting to c068, please wait...
[username@c068 ~]$
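While the session is running, its allocation can be checked from a separate login-node terminal with the standard Slurm squeue command (a generic example; substitute your own username):
[username@login01 ~]$ squeue -u username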
To exit from an interactive session, just type exit:
[username@c068 ~]$ exit
logout
salloc: Relinquishing job allocation 306844
[username@login01 ~]$
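If an interactive session becomes unresponsive, the allocation can also be released from the login node with the standard Slurm scancel command, using the job ID reported by salloc (a sketch reusing the job ID from the example above):
[username@login01 ~]$ scancel 306844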
By default, the interactive command will allocate a single compute core on a node for 12 hours, with a standard 4GB of RAM. This can be adjusted in the following ways:
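One adjustment not covered in the sections below is the time limit, which can be changed with the -t/--time option listed in the help output under More Information (a sketch requesting a two-hour session; the value is given in minutes):
[username@login01 ~]$ interactive -t 120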
Exclusive interactive session
[username@login01 ~]$ interactive --exclusive
salloc: Granted job allocation 306848
Job ID 306848
connecting to c174, please wait...
Note: This will give you the whole node exclusively for your job; if this is not specified, other jobs may be running on the allocated node at the same time. For example, if your job requires a significant number of processing cores this should be specified; a single-core task would not require it, and would leave the other processing cores idle. (See Interactive session with additional CPU cores below.)
Interactive session with additional CPU cores
[username@login01 ~]$ interactive -n24
salloc: Granted job allocation 306849
Job ID 306849
connecting to c174, please wait...
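Once inside the allocation, the requested cores can be used by launching work through srun, which runs within the resources already granted (a minimal sketch; my_program is a hypothetical executable):
[username@c174 ~]$ srun -n 24 ./my_program   # my_program is a placeholder for your own executable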
Interactive session with additional RAM
[username@login01 ~]$ interactive --mem=24G
salloc: Granted job allocation 306852
Job ID 306852
connecting to c068, please wait...
Note: if a job exceeds the requested amount of memory, it will terminate with an error message similar to the following (from a job which ran with a memory limit of 2GB):
slurmstepd: Step 307110.0 exceeded memory limit (23933492 > 2097152), being killed
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: got SIGCONT
slurmstepd: Exceeded job memory limit
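The memory actually used by a completed (or killed) job can be checked afterwards with the standard Slurm accounting command sacct (a generic sketch reusing the job ID from the message above; the fields available depend on the accounting configuration):
[username@login01 ~]$ sacct -j 307110 --format=JobID,MaxRSS,State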
Interactive session using a different partition
High memory node
[username@login01 ~]$ interactive -phighmem
salloc: Granted job allocation 306153
Job ID 306153
connecting to c233, please wait...
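The partitions available, together with their node counts and time limits, can be listed with the standard Slurm sinfo command (a generic sketch):
[username@login01 ~]$ sinfo -s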
GPU node with a single GPU
[username@login01 ~]$ interactive -pgpu
salloc: Granted job allocation 306855
Job ID 306855
connecting to gpu02, please wait...
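Inside a GPU session, the allocated GPU can be inspected with the nvidia-smi utility (assuming the NVIDIA driver tools are on the default path of the GPU nodes, which is typical but not confirmed here):
[username@gpu02 ~]$ nvidia-smi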
GPU node with all 4 GPUs and exclusive
[username@login01 ~]$ interactive -pgpu --gres=gpu:tesla:4 --exclusive
salloc: Granted job allocation 1043984
Job ID 1043984
connecting to gpu02, please wait...
Last login: Fri May 18 11:03:13 2018 from login01
[username@gpu02 ~]$
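A specific number of GPUs between one and four can also be requested by changing the count in the --gres specification (a sketch based on the gres syntax above; here two GPUs without exclusive access):
[username@login01 ~]$ interactive -pgpu --gres=gpu:tesla:2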
Interactive session with a node reservation
This example is for a reservation named 327889 on the gpu partition (queue); omitting the partition name will default to the compute queue.
[username@login01 ~]$ interactive -pgpu --reservation=327889
salloc: Granted job allocation 306353
Job ID 306353
connecting to gpu04, please wait...
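Active reservations, and the users or accounts entitled to use them, can be listed with the standard Slurm scontrol command (a generic sketch):
[username@login01 ~]$ scontrol show reservation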
More Information
More information can be found by typing the following (based on Slurm 15.08.8):
[username@login01 ~]$ interactive --help
Parallel run options:
  -A, --account=name          charge job to specified account
      --begin=time            defer job until HH:MM MM/DD/YY
      --bell                  ring the terminal bell when the job is allocated
      --bb=<spec>             burst buffer specifications
      --bbf=<file_name>       burst buffer specification file
  -c, --cpus-per-task=ncpus   number of cpus required per task
      --comment=name          arbitrary comment
      --cpu-freq=min[-max[:gov]]  requested cpu frequency (and governor)
  -d, --dependency=type:jobid defer job until condition on jobid is satisfied
  -D, --chdir=path            change working directory
      --get-user-env          used by Moab. See srun man page.
      --gid=group_id          group ID to run job as (user root only)
      --gres=list             required generic resources
  -H, --hold                  submit job in held state
  -I, --immediate[=secs]      exit if resources not available in "secs"
      --jobid=id              specify jobid to use
  -J, --job-name=jobname      name of job
  -k, --no-kill               do not kill job on node failure
  -K, --kill-command[=signal] signal to send terminating job
  -L, --licenses=names        required license, comma separated
  -m, --distribution=type     distribution method for processes to nodes (type = block|cyclic|arbitrary)
      --mail-type=type        notify on state change: BEGIN, END, FAIL or ALL
      --mail-user=user        who to send email notification for job state changes
  -n, --tasks=N               number of processors required
      --nice[=value]          decrease scheduling priority by value
      --no-bell               do NOT ring the terminal bell
      --ntasks-per-node=n     number of tasks to invoke on each node
  -N, --nodes=N               number of nodes on which to run (N = min[-max])
  -O, --overcommit            overcommit resources
      --power=flags           power management options
      --priority=value        set the priority of the job to value
      --profile=value         enable acct_gather_profile for detailed data
                              value is all or none or any combination of energy, lustre, network or task
  -p, --partition=partition   partition requested
      --qos=qos               quality of service
  -Q, --quiet                 quiet mode (suppress informational messages)
      --reboot                reboot compute nodes before starting job
  -s, --share                 share nodes with other jobs
      --sicp                  If specified, signifies job is to receive job id from the incluster reserve range.
      --signal=[B:]num[@time] send signal when time limit within time seconds
      --switches=max-switches{@max-time-to-wait}
                              Optimum switches and max time to wait for optimum
  -S, --core-spec=cores       count of reserved cores
      --thread-spec=threads   count of reserved threads
  -t, --time=minutes          time limit
      --time-min=minutes      minimum time limit (if distinct)
      --uid=user_id           user ID to run job as (user root only)
  -v, --verbose               verbose mode (multiple -v's increase verbosity)
      --wckey=wckey           wckey to run job under

Constraint options:
      --contiguous            demand a contiguous range of nodes
  -C, --constraint=list       specify a list of constraints
  -F, --nodefile=filename     request a specific list of hosts
      --mem=MB                minimum amount of real memory
      --mincpus=n             minimum number of logical processors (threads) per node
      --reservation=name      allocate resources from named reservation
      --tmp=MB                minimum amount of temporary disk
  -w, --nodelist=hosts...     request a specific list of hosts
  -x, --exclude=hosts...      exclude a specific list of hosts

Consumable resources related options:
      --exclusive[=user]      allocate nodes in exclusive mode when cpu consumable resource is enabled
      --mem-per-cpu=MB        maximum amount of real memory per allocated cpu required by the job.
                              --mem >= --mem-per-cpu if --mem is specified.

Affinity/Multi-core options: (when the task/affinity plugin is enabled)
  -B  --extra-node-info=S[:C[:T]]  Expands to:
      --sockets-per-node=S    number of sockets per node to allocate
      --cores-per-socket=C    number of cores per socket to allocate
      --threads-per-core=T    number of threads per core to allocate
                              each field can be 'min' or wildcard '*'
                              total cpus requested = (N x S x C x T)
      --ntasks-per-core=n     number of tasks to invoke on each core
      --ntasks-per-socket=n   number of tasks to invoke on each socket

Help options:
  -h, --help                  show this help message
  -u, --usage                 display brief usage message

Other options:
  -V, --version               output version information and exit
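Several of these options can be combined in a single request; for example, a named four-hour session with 16 cores and 32GB of memory might look like the following (a sketch using only options from the help text above; adjust the values and job name to suit your own work):
[username@login01 ~]$ interactive -n16 --mem=32G -t 240 -J mybuild   # mybuild is an arbitrary job name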
Further Information