[slurm-users] Slurm powersave
Davide DelVento
davide.quantum at gmail.com
Mon Dec 11 16:52:10 UTC 2023
In case it's useful to others: I've been able to get this working by having
the "no action" script stop the slurmd daemon and start it *with the -b
option*.
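
A minimal sketch of what such a script might look like, not the exact script in use. Assumptions not stated above: it is installed as ResumeProgram, the user running slurmctld can ssh to the nodes with enough privilege to manage slurmd, and slurmd runs under a systemd unit named "slurmd". The file name fake_resume.py and the helper names are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical "no action" power-save script: restart slurmd with -b
instead of actually powering the node off or on."""
import subprocess
import sys


def expand_hostlist(expr):
    """Expand a Slurm hostlist expression (e.g. node[01-03]) into hostnames."""
    out = subprocess.run(
        ["scontrol", "show", "hostnames", expr],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def restart_command(host):
    """Build the ssh command that stops slurmd and starts it again with -b.

    Assumption: ssh access with sufficient privilege, and a systemd unit
    named "slurmd". With -b, slurmd reports the node as rebooted.
    """
    remote = "systemctl stop slurmd && slurmd -b"
    return ["ssh", host, remote]


def main(hostlist_expr):
    """Restart slurmd on every node in the hostlist; return nonzero on failure."""
    rc = 0
    for host in expand_hostlist(hostlist_expr):
        result = subprocess.run(restart_command(host))
        rc = rc or result.returncode
    return rc


# Slurm passes the hostlist expression as the only argument.
# Do nothing when the script is run without one.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

It would be referenced from slurm.conf as ResumeProgram with its full path, and slurmctld would call it with a hostlist such as node[01-04].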
On Fri, Oct 6, 2023 at 4:28 AM Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk>
wrote:
> Hi Davide,
>
> On 10/5/23 15:28, Davide DelVento wrote:
> > IMHO, "pretending" to power down nodes defies the logic of the Slurm
> > power_save plugin.
> >
> > And it is sure useless ;)
> > But I was using the suggestion from
> > https://slurm.schedmd.com/power_save.html which says
> >
> > You can also configure Slurm with programs that perform no action as
> > *SuspendProgram* and *ResumeProgram* to assess the potential impact of
> > power saving mode before enabling it.
>
> I had not noticed the above sentence in the power_save manual before! So
> I decided to test a "no action" power saving script, similar to what you
> have done, applying it to a test partition. I conclude that "no action"
> power saving DOES NOT WORK, at least in Slurm 23.02.5. So I opened a bug
> report https://bugs.schedmd.com/show_bug.cgi?id=17848 to find out if the
> documentation is obsolete, or if there may be a bug. Please follow that
> bug to find out the answer from SchedMD.
>
> What I *believe* (but not with 100% certainty) really happens with power
> saving in the current Slurm versions is what I wrote yesterday:
>
> > Slurmctld expects suspended nodes to *really* power down (slurmd is
> > stopped). When slurmctld resumes a suspended node, it expects slurmd
> > to start up when the node is powered on. There is a ResumeTimeout
> > parameter which I've set to about 15-30 minutes in case of delays due
> > to BIOS updates and the like - the default of 60 seconds is WAY too
> > small!
>
> I hope this helps,
> Ole
>
>