[slurm-users] Power saving
Benson Muite
benson_muite at emailplus.org
Thu Jul 28 18:56:17 UTC 2022
On 7/28/22 18:49, Djamil Lakhdar-Hamina wrote:
> I am helping set up a 16-node cluster computing system. I am not a
> system admin, but I work for a small firm and have to pick up needed
> skills fast in areas where I have little experience. I am running
> Rocky Linux 8 on Intel Xeon Knights Landing nodes donated by
> the TAAC center. We are operating in Uganda, where we have limited
> resources and where power is quite expensive.
>
> What are some good ways to implement power saving? I have already tried
> power saving as per Slurm's power saving guide, but 1) I am not quite sure
> what it does and 2) in implementing a version in my virtual dev
> environment I was able to get the power saving to stand down nodes, but
> I was not able to get the power saving mechanism to spin them back up
> when needed. I put power saving in the slurm.conf file, and I also
> specified a SuspendProgram and a ResumeProgram similar to the ones in
> https://slurm.schedmd.com/power_save.html
You might also look at Variorum:
https://variorum.readthedocs.io/en/latest/api/cap_functions.html
https://github.com/LLNL/variorum
>
> So 1) how do I get this power saving mechanism to work, and what exactly
> will it do? I see it stands nodes down; will it spin them back up on
> request of those resources? 2) Are there any better techniques for power
> saving, say using IPMItool or something?
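
On the resume question: Slurm runs the configured ResumeProgram with a
hostlist expression (for example node[01-02]) as its argument when a job is
allocated to powered-down nodes, so the script has to expand that expression
and actually power the machines on. Below is a minimal sketch of such a
script that calls ipmitool, not a tested production setup. The node names,
BMC addresses, user name and password file path are all hypothetical
placeholders, and it assumes each node has a BMC reachable over the network.

```python
#!/usr/bin/env python3
"""Sketch of a Slurm ResumeProgram that powers nodes on through their BMCs."""
import subprocess
import sys

# Hypothetical mapping from Slurm node name to BMC address; replace with your own.
BMC_ADDRESS = {"node01": "10.0.0.101", "node02": "10.0.0.102"}


def expand_hostlist(expr):
    """Expand e.g. 'node[01-02]' into individual node names using scontrol."""
    out = subprocess.run(
        ["scontrol", "show", "hostnames", expr],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def power_command(node, action="on", user="admin",
                  password_file="/etc/slurm/ipmi.pass"):
    """Build the ipmitool command line for one node (user and path are assumed)."""
    return ["ipmitool", "-I", "lanplus", "-H", BMC_ADDRESS[node],
            "-U", user, "-f", password_file, "chassis", "power", action]


def main(hostlist_expr):
    """Power on every node named in the hostlist expression Slurm passes in."""
    for node in expand_hostlist(hostlist_expr):
        # check=False so one unreachable BMC does not stop the remaining nodes
        subprocess.run(power_command(node), check=False)


# Only act when Slurm (or a person) supplies a hostlist argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

A matching SuspendProgram could reuse power_command with action="soft" to
request a clean shutdown. In a purely virtual dev environment there is no
BMC to talk to, so the resume script there would need to start the VMs by
whatever means the hypervisor offers instead.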
>
> Sincerely,
> Djamil Lakhdar-Hamina