[slurm-users] Node suspend / Power saving - for *idle* nodes only?
kg4ydw at gmail.com
Fri May 15 14:20:33 UTC 2020
I've had slurm power off a few nodes I was working on...
My usual fix is to power them back on myself, without slurm's help.
Slurm then brings the node up in state "down / unexpectedly booted"
and leaves it alone until I change the state again with scontrol
(I like "scontrol reboot nextstate=resume" for this).
And if you want to be extra sure, change things so your SuspendProgram
can't shut the node off while you are working on it.
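
As a rough sketch of that last idea (not something I have running in
production): slurm calls SuspendProgram with a hostlist expression as its
only argument, so a small wrapper can expand it and skip any node named in
a "hold" file. The hold file path, its one-name-per-line format, and the
power_off() placeholder are all assumptions of mine, not slurm features.

```python
#!/usr/bin/env python3
"""Hypothetical SuspendProgram wrapper that skips nodes on a hold list."""
import subprocess
import sys
from pathlib import Path

# Assumption: site-specific file, one node name per line, '#' starts a comment.
HOLD_FILE = Path("/etc/slurm/power_hold")


def parse_hold_list(text):
    """Return the set of node names found in the hold file's text."""
    held = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            held.add(name)
    return held


def nodes_to_suspend(nodes, held):
    """Keep only the nodes that are not on the hold list, preserving order."""
    return [n for n in nodes if n not in held]


def expand_hostlist(expr):
    """Expand e.g. 'node[01-03]' into individual names using scontrol."""
    result = subprocess.run(
        ["scontrol", "show", "hostnames", expr],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def power_off(node):
    """Placeholder: replace with whatever your site uses to power off a node."""
    print(f"would power off {node}")


def main(argv):
    if len(argv) != 2:
        print("usage: suspend_wrapper.py <hostlist>", file=sys.stderr)
        return 2
    held = parse_hold_list(HOLD_FILE.read_text()) if HOLD_FILE.exists() else set()
    for node in nodes_to_suspend(expand_hostlist(argv[1]), held):
        power_off(node)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

To use it, you would point SuspendProgram in slurm.conf at this script and
add a node's name to the hold file before working on it. Slurm will
probably still record a skipped node as powered down, so check its state
afterwards. slurm.conf also has a SuspendExcNodes option for excluding
nodes from power saving, which may be simpler if the set of nodes rarely
changes.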
On Thu, May 14, 2020 at 9:14 AM Florian Zillner <fzillner at lenovo.com> wrote:
> I'm experimenting with slurm's power saving feature. Shutting down "idle" nodes works in general, and powering up also works when "idle~" nodes are requested.
> So far so good, but slurm is also shutting down nodes that are not explicitly "idle". Previously I drained a node to debug something on it, and slurm shut it down when the SuspendTime was reached.
> Is this something I can configure, or should the SuspendProgram deal with this and ignore poweroff requests for non-idle nodes? I haven't found a setting for this; if there is one, please point me to it. Btw, we're on 18.08.8.