[slurm-users] How to run one maintenance job on each node in the cluster
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Sat Dec 23 10:20:25 UTC 2023
On 23-12-2023 05:09, Jeffrey Tunison wrote:
> Is there a straightforward way to create a batch job that runs once on
> every node in the cluster?
>
> A technique simpler than generating a list from sinfo output and
> dispatching the job in a for loop for the N nodes.
>
> That’s not very hard, but I thought there might be an elegant solution
> which would make dispatching maintenance jobs easier.
One solution is the method in this script:
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/update.sh
This works very reliably for us when we need to apply OS or firmware
updates.
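For readers who just want the general idea, here is a minimal sketch of the
one-job-per-node approach. It is not a copy of update.sh: the script path
./maintenance.sh, the job name prefix "maint-" and the DRY_RUN switch are
assumptions made for illustration. It lists the nodes with sinfo and submits
one exclusive single-node job pinned to each node with sbatch --nodelist.

```shell
#!/bin/sh
# Sketch: submit one exclusive single-node job per compute node.
# MAINT_SCRIPT, the "maint-" job name and DRY_RUN are hypothetical placeholders.

MAINT_SCRIPT="${MAINT_SCRIPT:-./maintenance.sh}"

# List unique node names, one per line.
# -h drops the header, -N gives node-oriented output, %N is the node name.
# sort -u removes duplicates for nodes that belong to several partitions.
list_nodes() {
    sinfo -h -N -o '%N' | sort -u
}

# Build the sbatch command for one node; print it when DRY_RUN=1 (the default),
# otherwise run it.
submit_node() {
    node="$1"
    set -- sbatch --nodelist="$node" --nodes=1 --exclusive \
        --job-name="maint-$node" "$MAINT_SCRIPT"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Only contact Slurm when its commands are actually installed.
if command -v sinfo >/dev/null 2>&1; then
    for node in $(list_nodes); do
        submit_node "$node"
    done
fi
```

Run it as is to see which sbatch commands would be issued, then with
DRY_RUN=0 to actually submit. Jobs pinned to nodes that are down or drained
will stay pending until those nodes return to service, so you may want to
filter the node list by state first.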
> SLURM 22.05.09
Note: You should apply the recent Slurm security updates ASAP!
/Ole