[slurm-users] Cancel "reboot ASAP" for a node

Hanby, Mike mhanby at uab.edu
Fri Aug 7 15:43:09 UTC 2020


Howdy, (Slurm 18.08)

We have a bunch of nodes that we've flagged for reboot with "scontrol reboot ASAP".

We'd like to cancel a few of those. According to the man page, either of the following should work; however, both report the same error, "slurm_update error: Invalid node state specified":

scontrol cancel_reboot c01
or
scontrol Update NodeName=c01 State=CANCEL_REBOOT

Here's the 'scontrol show node c01' info for reference:

NodeName=c01 Arch=x86_64 CoresPerSocket=12
   CPUAlloc=7 CPUTot=24 CPULoad=7.04
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=c0115 NodeHostName=c01 Version=18.08
   OS=Linux 3.10.0-1062.9.1.el7.x86_64 #1 SMP Mon Dec 2 08:31:54 EST 2019
   RealMemory=191877 AllocMem=6536 FreeMem=176717 Sockets=2 Boards=1
   State=MIXED+DRAIN ThreadsPerCore=1 TmpDisk=887366 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=interactive,short,long,medium,express
   BootTime=2020-07-08T23:16:27 SlurmdStartTime=2020-07-08T23:32:05
   CfgTRES=cpu=24,mem=191877M,billing=24
   AllocTRES=cpu=7,mem=6536M
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
   Reason=Reboot ASAP [root@2020-08-06T10:29:22]
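
For anyone wanting to see which nodes are still flagged, here is a minimal sketch that filters 'scontrol show node' output for the Reason text shown above. The function name pending_reboot_nodes is hypothetical, and the sketch assumes the reason string begins with "Reboot ASAP", as it does for c01; a node rebooted with a custom reason would not match.

```shell
# Hypothetical helper: print the names of nodes whose Reason field
# begins with "Reboot ASAP". Reads `scontrol show node` output on stdin.
pending_reboot_nodes() {
    awk '
        # Each node record starts with a NodeName=<name> token.
        /^NodeName=/ { split($1, kv, "="); node = kv[2] }
        # Assumption: the reason text is the one shown for c01 above.
        /^[ \t]*Reason=Reboot ASAP/ { print node }
    '
}

# Usage on a cluster (assumes scontrol is on PATH):
#   scontrol show node | pending_reboot_nodes
```

This only reports the flagged nodes; it does not change their state.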

Any thoughts as to how to cancel the reboot?

----------------
Mike Hanby
mhanby @ uab.edu
Systems Analyst III - Enterprise
IT Research Computing Services
The University of Alabama at Birmingham