[slurm-users] Rolling upgrade of compute nodes
Christopher Samuel
chris at csamuel.org
Mon May 30 06:25:55 UTC 2022
On 5/29/22 3:09 pm, byron wrote:
> This is the first time I've done an upgrade of Slurm, and I had been
> hoping to do a rolling upgrade rather than waiting for all the jobs to
> finish on all the compute nodes and then switching across, but I don't see
> how I can do it with this setup. Does anyone have any experience of this?
We do rolling upgrades with:
scontrol reboot ASAP nextstate=resume reason="some-useful-reason" [list-of-nodes]
But you do need to have RebootProgram defined in slurm.conf and an
appropriate ResumeTimeout set to allow enough time for your node to reboot
(and of course your system must be configured to boot into a
production-ready state when rebooted, including starting up slurmd).
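As a rough sketch of how that might be wrapped up, the script below prints
the reboot command by default and only runs it when RUN=1 is set, after
showing the two settings mentioned above. The node range, the reason string,
and the RUN/NODES/REASON variables are placeholders made up for illustration,
not something from our setup:

```shell
#!/bin/sh
# Sketch only. NODES and REASON are hypothetical placeholders.
NODES="${NODES:-node[001-004]}"        # assumed node range, adjust to your cluster
REASON="${REASON:-rolling-upgrade}"    # assumed reason string

# Print the command unless RUN=1, so it can be reviewed before anything reboots.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Show the two settings the reboot relies on (needs a working Slurm install).
check_config() {
    scontrol show config | grep -E 'RebootProgram|ResumeTimeout'
}

if [ "${RUN:-0}" = "1" ]; then
    check_config || exit 1
fi

# ASAP drains the nodes and reboots each one once its running jobs finish;
# nextstate=resume returns the node to service after it comes back up.
run scontrol reboot ASAP nextstate=resume "reason=$REASON" "$NODES"
```

Run without RUN set, it just echoes the scontrol line it would issue, which
is a cheap way to check the node list before committing to it.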
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA