[slurm-users] Power Save: When is RESUME an invalid node state?
Stefan Staeglich
staeglis at informatik.uni-freiburg.de
Thu Dec 7 14:46:49 UTC 2023
Hi Xaver,
we also had a similar problem with Slurm 21.08 (see thread "error: power_save
module disabled, NULL SuspendProgram").
Fortunately, we have not observed it again since the upgrade to 23.02. But
about a month is still too short to know whether the problem is really
fixed, as we are still within the interval at which it used to recur.
Best regards,
Stefan
On Wednesday, 6 December 2023, 12:14:46 CET, Xaver Stiensmeier wrote:
> Hi Ole,
>
> for multiple reasons we build it ourselves. I am not really involved
> in that process, but I will contact the person who is. Thanks for the
> recommendation! We should probably implement a regular check for
> new Slurm versions. I am not 100% sure whether this will fix our
> issues, but it's worth a try.
>
> Best regards
> Xaver
>
> On 06.12.23 12:03, Ole Holm Nielsen wrote:
> > On 12/6/23 11:51, Xaver Stiensmeier wrote:
> >> Good idea. Here's our current version:
> >>
> >> ```
> >> sinfo -V
> >> slurm 22.05.7
> >> ```
> >>
> >> Quick googling told me that the latest version is 23.11. Does the
> >> upgrade change anything in that regard? I will keep reading.
> >
> > There are nice bug fixes in 23.02 mentioned in my SLUG'23 talk "Saving
> > Power with Slurm" at https://slurm.schedmd.com/publications.html
> >
> > For reasons of security and functionality it is recommended to follow
> > Slurm's releases (maybe not the first few minor versions of new major
> > releases like 23.11). FYI, I've collected information about upgrading
> > Slurm in the Wiki page
> > > https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_installation/#upgrading-slurm
> >
> > /Ole
--
Albert-Ludwigs-Universität Freiburg
Institut für Informatik
Professur für Maschinelles Lernen
Stefan Stäglich
System-Administrator
T +49 761 203-8223
staeglis at informatik.uni-freiburg.de
https://ml.informatik.uni-freiburg.de
Georges-Köhler-Allee 52
D-79110 Freiburg