[slurm-users] Rolling reboot with at most N machines down simultaneously?
David Simpson
SimpsonD4 at cardiff.ac.uk
Thu Aug 4 16:03:28 UTC 2022
Another way might be to implement Slurm's node power off/on support (if it isn't already in place) and trigger it as required.
-------------
David Simpson - Senior Systems Engineer
ARCCA, Redwood Building,
King Edward VII Avenue,
Cardiff, CF10 3NB
-----Original Message-----
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Brian Andrus
Sent: 04 August 2022 14:47
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] Rolling reboot with at most N machines down simultaneously?
This is actually brilliant!
Brian Andrus
On 8/3/2022 10:20 PM, Gerhard Strangar wrote:
> Phil Chiu wrote:
>
>> - Individual slurm jobs which reboot nodes - With a for loop, I could
>> submit a reboot job for each node. But I'm not sure how to limit this so at
>> most N jobs are running simultaneously.
> With a fake license called reboot?
>
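
A rough sketch of the fake-license idea, assuming the cluster defines a license pool in slurm.conf along the lines of `Licenses=reboot:N`, with N being the number of nodes allowed down at once. The helper name `reboot_job_commands` and the `sudo reboot` payload are hypothetical, not something from this thread. It only builds one `sbatch` command per node, each requesting a single `reboot` license, so that Slurm should start at most N of these jobs at a time.

```python
import shlex


def reboot_job_commands(nodes, license_name="reboot", reboot_cmd="sudo reboot"):
    """Build one sbatch command per node; each job asks for one license.

    Assumptions (not from the thread): the license pool is already defined
    in slurm.conf, and reboot_cmd is whatever the site uses to reboot a
    node from inside a job.
    """
    commands = []
    for node in nodes:
        commands.append([
            "sbatch",
            f"--nodelist={node}",             # pin the job to this node
            "--exclusive",                    # do not share the node with other jobs
            f"--licenses={license_name}:1",   # one license per reboot job
            f"--job-name=reboot-{node}",
            f"--wrap={reboot_cmd}",           # hypothetical reboot payload
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of submitting them.
    for cmd in reboot_job_commands(["node01", "node02", "node03"]):
        print(shlex.join(cmd))
```

Note that a license is held only while its job is running, so whether this really caps the number of nodes that are down depends on how the reboot job behaves once the node goes away. That would need testing on the cluster in question.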