[slurm-users] How to run one maintenance job on each node in the cluster

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Sat Dec 23 10:20:25 UTC 2023


On 23-12-2023 05:09, Jeffrey Tunison wrote:
> Is there a straightforward way to create a batch job that runs once on 
> every node in the cluster?
> 
> I am looking for a technique simpler than generating a list from sinfo 
> output and dispatching the job in a for loop over the N nodes.
> 
> That’s not very hard, but I thought there might be an elegant solution 
> which would make dispatching maintenance jobs easier.

One solution is the method in this script:
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/update.sh

This works very reliably for us when we need to apply OS or firmware 
updates.
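
For illustration only, and not taken from the linked script: a minimal 
sketch of the general pattern of pinning one job to each node. It assumes 
the standard sinfo and sbatch options shown. The function names, the 
"maint-" job name prefix, the SBATCH override variable, and the batch 
script maintenance.sh are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: submit one exclusive job per node (assumed approach, not update.sh).

# Print each node name once, one per line. A node that belongs to
# several partitions is reported several times by sinfo, hence sort -u.
list_nodes() {
    sinfo --Node --noheader --format="%N" | sort -u
}

# Read node names on stdin and submit one job pinned to each node.
# Set SBATCH=echo for a dry run that only prints the arguments.
submit_per_node() {
    local script="$1" node
    while read -r node; do
        [ -n "$node" ] || continue          # skip blank lines
        "${SBATCH:-sbatch}" --nodelist="$node" --nodes=1 --exclusive \
            --job-name="maint-$node" "$script"
    done
}

# Usage (maintenance.sh is a hypothetical batch script):
#   list_nodes | submit_per_node maintenance.sh
```

With --exclusive each job waits until its node is free of other jobs, 
which is usually what a maintenance task wants. Jobs pinned to nodes that 
are down or drained will simply stay pending.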

> SLURM 22.05.09

Note: You should apply the recent Slurm security updates ASAP!

/Ole


