[slurm-users] Rolling upgrade of compute nodes

Christopher Samuel chris at csamuel.org
Mon May 30 06:25:55 UTC 2022


On 5/29/22 3:09 pm, byron wrote:

>   This is the first time I've done an upgrade of slurm and I had been
> hoping to do a rolling upgrade as opposed to waiting for all the jobs to
> finish on all the compute nodes and then switching across, but I don't see
> how I can do it with this setup.  Does anyone have any experience of this?

We do rolling upgrades with:

scontrol reboot ASAP nextstate=resume reason="some-useful-reason" [list-of-nodes]

But you do need to have RebootProgram defined and an appropriate
ResumeTimeout set to allow enough time for your node to reboot (and of
course your system must be configured to boot into a production-ready
state when rebooted, including starting up slurmd).
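
As a rough sketch, the two slurm.conf settings mentioned above might look
like the following. The reboot path and the 900-second timeout are
assumptions chosen for illustration, not values from the original setup;
pick ones that match how long your own nodes take to come back.

```
# slurm.conf (excerpt); both values below are illustrative assumptions
# Program that slurmd runs on the node when a reboot is requested
RebootProgram=/usr/sbin/reboot
# Seconds allowed for a rebooted node to respond before it is marked DOWN
ResumeTimeout=900
```

After changing slurm.conf, "scontrol show config" should report the values
the controller is actually using.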

All the best,
Chris
-- 
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA


