[slurm-users] Set a random offset when starting node health check in SLURM
SJTU
weijianwen at sjtu.edu.cn
Fri Nov 27 03:24:03 UTC 2020
Hi,
We use HealthCheckProgram = /usr/sbin/nhc in Slurm to check node health every 600 seconds. However, some NHC checks query the same central resource, so starting these checks simultaneously on every node may lead to false alarms of service degradation.
Is it possible to add a random offset to the time at which HealthCheckProgram starts?
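To make the question concrete, the kind of workaround I have in mind is a small wrapper that sleeps for a random delay and then runs NHC, with HealthCheckProgram pointed at the wrapper instead of at /usr/sbin/nhc. This is only a rough sketch: the wrapper itself, the MAX_OFFSET_SECONDS value, and the pick_offset helper are made up for illustration. I also believe slurmd kills a health check that runs too long, so the offset would have to stay small.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: wait a random offset, then hand over to NHC."""
import os
import random
import sys
import time

NHC_PATH = "/usr/sbin/nhc"   # the program currently set as HealthCheckProgram
MAX_OFFSET_SECONDS = 30      # assumed upper bound; not a Slurm or NHC setting


def pick_offset(max_seconds, rng=random):
    """Return a uniformly distributed delay between 0 and max_seconds."""
    return rng.uniform(0, max_seconds)


def main(argv):
    # Spread the start times so that nodes do not hit the shared resource together.
    time.sleep(pick_offset(MAX_OFFSET_SECONDS))
    # Replace this process with NHC so that its exit status is what slurmd sees.
    os.execv(NHC_PATH, [NHC_PATH] + argv[1:])


if __name__ == "__main__":
    main(sys.argv)
```

A built-in option would still be preferable, since the delay counts against the run time of the health check itself.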
Thank you!
Jianwen