[slurm-users] Set a random offset when starting node health check in SLURM

SJTU weijianwen at sjtu.edu.cn
Fri Nov 27 03:24:03 UTC 2020


Hi,

   We use HealthCheckProgram = /usr/sbin/nhc in Slurm to check node health every 600 seconds. However, some NHC checks point to the same central resource, so starting these checks simultaneously may lead to false alarms of service degradation.

   Is it possible to set a random offset for when HealthCheckProgram starts?
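
   To illustrate the kind of offset I mean, here is a minimal sketch of a wrapper that HealthCheckProgram could point to instead of /usr/sbin/nhc. It is only an assumption about how this might be done, not a built-in Slurm or NHC feature. The names offset_for_host and MAX_OFFSET_SECONDS are hypothetical, and the 30-second ceiling is an arbitrary choice that would need to stay below any time limit slurmd imposes on the health check program. It uses a hash of the hostname rather than a true random number, so each node keeps the same offset from run to run.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: delay NHC by a per-node offset, then run it."""
import hashlib
import os
import socket
import sys
import time

NHC_PATH = "/usr/sbin/nhc"   # path taken from the HealthCheckProgram setting
MAX_OFFSET_SECONDS = 30      # assumption: choose a value that suits the site


def offset_for_host(hostname: str, max_offset: int) -> int:
    """Map a hostname to a stable offset in the range [0, max_offset)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % max_offset


def main() -> None:
    # Sleep for this node's offset so nodes do not all start at once.
    delay = offset_for_host(socket.gethostname(), MAX_OFFSET_SECONDS)
    time.sleep(delay)
    # Replace this process with NHC so its exit status is passed through.
    os.execv(NHC_PATH, [NHC_PATH] + sys.argv[1:])


if __name__ == "__main__":
    main()
```

   With something like this, nodes would spread their checks over the offset window instead of hitting the central resource at the same moment.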


Thank you!

Jianwen

