[slurm-users] Power saving

Benson Muite benson_muite at emailplus.org
Thu Jul 28 18:56:17 UTC 2022


On 7/28/22 18:49, Djamil Lakhdar-Hamina wrote:
> I am helping set up a 16-node cluster. I am not a system administrator, 
> but I work for a small firm and have to pick up the needed skills 
> quickly in areas where I have little experience. I am running Rocky 
> Linux 8 on Intel Xeon Phi (Knights Landing) nodes donated by 
> the TAAC center. We are operating in Uganda, where we have limited 
> resources and where power is quite expensive.
> 
> What are some good ways to implement power saving? I have already tried 
> power saving as per Slurm's power saving guide, but 1) I am not quite sure 
> what it does, and 2) when implementing a version in my virtual dev 
> environment, I was able to get power saving to stand nodes down, but 
> I was not able to get the mechanism to spin them back up 
> when needed. I enabled power saving in the slurm.conf file, and I also 
> specified a SuspendProgram and a ResumeProgram similar to the ones in 
> https://slurm.schedmd.com/power_save.html.
You might also look at Variorum, a library from LLNL for power monitoring and capping:
https://variorum.readthedocs.io/en/latest/api/cap_functions.html
https://github.com/LLNL/variorum
> 
> So 1) how do I get this power saving mechanism to work, and what exactly 
> will it do? I see it stands nodes down; will it spin them back up when 
> those resources are requested? 2) Are there any better techniques for power 
> saving, say using ipmitool or something?
> 
> Sincerely,
> Djamil Lakhdar-Hamina
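
On the SuspendProgram/ResumeProgram question: Slurm runs both programs as
SlurmUser and passes the affected nodes as a single hostlist expression
(for example node[01-04]). Below is a rough sketch of one script that could
serve as both programs by driving ipmitool, not a tested implementation. The
BMC addresses, the IPMI user, the password file path, and the symlink names
node_resume and node_suspend are all made up for illustration, and it
assumes the nodes have IPMI-capable BMCs reachable from the controller.

```python
#!/usr/bin/env python3
"""Sketch: one script used as both SuspendProgram and ResumeProgram."""
import os
import subprocess
import sys

# Hypothetical values: replace with the BMC details of your own nodes.
BMC_ADDRESS = {"node01": "10.0.0.101", "node02": "10.0.0.102"}
IPMI_USER = "admin"
IPMI_PASSWORD_FILE = "/etc/slurm/ipmi_password"  # read by ipmitool -f

# The script decides what to do from the name it was invoked under, so it
# can be installed once and symlinked as node_resume and node_suspend.
ACTIONS = {"node_resume": "on", "node_suspend": "soft"}


def action_for(program_path):
    """Return the ipmitool power action for this invocation name, or None."""
    return ACTIONS.get(os.path.basename(program_path))


def expand_hostlist(hostlist):
    """Expand a Slurm hostlist such as node[01-02] into individual names."""
    result = subprocess.run(
        ["scontrol", "show", "hostnames", hostlist],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def ipmi_command(node, action):
    """Build the ipmitool command line that powers one node on or off."""
    return [
        "ipmitool", "-I", "lanplus",
        "-H", BMC_ADDRESS[node],
        "-U", IPMI_USER,
        "-f", IPMI_PASSWORD_FILE,
        "chassis", "power", action,
    ]


def main(argv):
    action = action_for(argv[0])
    if action is None or len(argv) != 2:
        print("usage: node_resume|node_suspend <hostlist>", file=sys.stderr)
        return 2
    status = 0
    for node in expand_hostlist(argv[1]):
        if node not in BMC_ADDRESS:
            print(f"no BMC address known for {node}", file=sys.stderr)
            status = 1
            continue
        # Keep going if one node fails so the others are still handled.
        if subprocess.run(ipmi_command(node, action)).returncode != 0:
            status = 1
    return status


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

In slurm.conf, SuspendProgram and ResumeProgram would then point at the two
symlinks. If nodes power down but never come back, it is worth checking that
slurmd starts automatically at boot on the compute nodes and that
ResumeTimeout is longer than the time a node takes to boot and register.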



