Slurm Workload Manager

Usage

Check Cluster State

sinfo

Example
[carlsonc@aihst-login ~]$ sinfo
PARTITION     AVAIL  TIMELIMIT  NODES  STATE NODELIST
NICT             up 30-00:00:0      1  down* aixl675dn01
NICT             up 30-00:00:0      3   idle aixl675dn[02,08-09]
short_all*       up    4:00:00      6   idle aixl645dn[02-04],aixl675dn[03-05]
short_tusken     up    4:00:00      3   idle aixl675dn[03-05]
normal_tusken    up 1-00:00:00      3   idle aixl675dn[03-05]
long_tusken      up 7-00:00:00      3   idle aixl675dn[03-05]
short_bantha     up    4:00:00      3   idle aixl645dn[02-04]
normal_bantha    up 1-00:00:00      3   idle aixl645dn[02-04]
long_bantha      up 7-00:00:00      3   idle aixl645dn[02-04]
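
The full listing can be narrowed with sinfo's partition and state filters. A minimal sketch, assuming the partition names shown above; idle_nodes is a hypothetical helper name, not a Slurm command.

```shell
# idle_nodes is a hypothetical helper, not part of Slurm.
# Usage: idle_nodes <partition>
# Prints the node list of the idle nodes in that partition.
idle_nodes() {
    # -h drops the header line, -p selects the partition,
    # -t filters by node state, -o "%N" prints only the node list.
    sinfo -h -p "$1" -t idle -o "%N"
}
```

For example, idle_nodes short_tusken prints only the idle nodes of the short_tusken partition.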

Allocate a Node from a Partition

salloc

Example
[carlsonc@aihst-login ~]$ salloc -p short_tusken --nodelist=aixl675dn04
salloc: Granted job allocation 2942
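
An allocation lasts until it is cancelled or its time limit is reached, so it can help to request less than the partition maximum. A minimal sketch, assuming the same partition and node as above; alloc_node is a hypothetical wrapper, and the one-hour limit in the usage note is only an illustration.

```shell
# alloc_node is a hypothetical wrapper, not part of Slurm.
# Usage: alloc_node <partition> <node> <time limit>
alloc_node() {
    # -p selects the partition, --nodelist pins the allocation to one node,
    # --time caps how long the allocation may run (hours:minutes:seconds).
    salloc -p "$1" --nodelist="$2" --time="$3"
}
```

For example, alloc_node short_tusken aixl675dn04 1:00:00 requests the node for at most one hour.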

View Allocations

squeue

Example
[carlsonc@aihst-login ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              2942 short_tus interact carlsonc  R       0:17      1 aixl675dn04
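
On a busy cluster the unfiltered queue can be long. A minimal sketch that lists only your own jobs with a shorter set of columns; my_jobs is a hypothetical helper name, and the column selection is just one reasonable choice.

```shell
# my_jobs is a hypothetical helper, not part of Slurm.
# Lists the current user's jobs only.
my_jobs() {
    # -u restricts the listing to one user; -o selects the columns:
    # %i job ID, %P partition, %t compact state, %N node list.
    squeue -u "${USER:-$(id -un)}" -o "%i %P %t %N"
}
```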

Free a Node from an Allocation

scancel <job_id>

Use squeue to find the job ID of the allocation that holds your node, then pass that ID to scancel.

Example
[carlsonc@aixl675dn04 gdsio]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              2942 short_tus interact carlsonc  R    1:22:29      1 aixl675dn04
[carlsonc@aixl675dn04 gdsio]$ scancel 2942
salloc: Job allocation 2942 has been revoked.
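
The lookup and the cancel can be combined into one step. A minimal sketch, assuming you want to cancel every job of yours on the given node; free_node is a hypothetical helper name, not a Slurm command.

```shell
# free_node is a hypothetical helper, not part of Slurm.
# Usage: free_node <node>
# Cancels all of the current user's jobs on that node.
free_node() {
    # -h drops the header, -u restricts to the current user,
    # -w filters by node, -o "%i" prints only the job IDs.
    for job_id in $(squeue -h -u "${USER:-$(id -un)}" -w "$1" -o "%i"); do
        scancel "$job_id"
    done
}
```

For example, free_node aixl675dn04 cancels your allocations on aixl675dn04 without looking up the job ID by hand.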