[slurm-users] monitoring and update regime for Power Saving nodes
David Simpson
SimpsonD4 at cardiff.ac.uk
Wed Feb 23 10:31:11 UTC 2022
Hi all,
Interested to know what the common approaches are to:
* Monitoring power saving nodes (e.g. node health) when the monitoring system will potentially see them go up and down. Do you limit monitoring/health checks to the BMC only?
* Making changes to slurm.conf (or anything else) on a node that is down due to power saving (during a maintenance window/reservation). What is your approach? Do you end up with two slurm.conf files (one for power saving, and one that keeps everything up to work on during the maintenance)?
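
To make the first question concrete, here is a minimal sketch (not a setup anyone has confirmed using) of how a monitoring check might ask Slurm whether a node is expected to be off before raising a host-down alert. It assumes the monitoring host can run `sinfo`, and that the state suffixes mean what the sinfo man page describes (`~` powered down, `#` powering up, `%` powering down); the exact suffix set may differ between Slurm versions. The function names and the alerting policy are hypothetical.

```python
import subprocess

# Suffixes sinfo appends to a node state. Meanings are taken from the sinfo
# man page and may vary by Slurm version (assumption).
POWER_SUFFIXES = {
    "~": "powered_down",
    "#": "powering_up",
    "%": "powering_down",
}


def parse_sinfo(text):
    """Map node name -> (base_state, power_state or None).

    Expects the output of: sinfo -h -N -o "%N %T"
    """
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, state = parts
        power = None
        # A state can carry more than one trailing flag (e.g. '*' for
        # not responding), so strip them one at a time.
        while state and not state[-1].isalnum():
            power = POWER_SUFFIXES.get(state[-1], power)
            state = state[:-1]
        nodes[name] = (state, power)
    return nodes


def should_alert(nodes, name):
    """Hypothetical policy: only alert when Slurm expects the node to be up."""
    _state, power = nodes.get(name, ("unknown", None))
    return power is None


def query_sinfo():
    """Run sinfo and parse its output (requires Slurm client tools)."""
    out = subprocess.run(
        ["sinfo", "-h", "-N", "-o", "%N %T"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_sinfo(out)
```

A check like this would sit in front of the usual ping/health probes, with BMC checks left running regardless of power state.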
thanks
David
-------------
David Simpson - Senior Systems Engineer
ARCCA, Redwood Building,
King Edward VII Avenue,
Cardiff, CF10 3NB
simpsond4 at cardiff.ac.uk
+44 29208 74657