Great. I will check that out.
-Paul Edmon-
On 8/19/25 11:45 AM, Otto, Frank wrote:
Hi Paul,
The dev branch of NHC is more up to date (though also 7 months stale now), and we are running it on RHEL 9.6 with Slurm 24.11. Admittedly, we haven't been running it for very long, so there might be issues we just haven't encountered yet, but in general it seems to work.
Kind regards, Frank
--
Dr. Frank Otto
Principal Research Infrastructure Developer
UCL Advanced Research Computing Centre
Tel: 020 7679 1506
*From:* Paul Edmon via slurm-users <slurm-users@lists.schedmd.com>
*Sent:* 19 August 2025 15:20
*To:* slurm-users@lists.schedmd.com
*Subject:* [slurm-users] Node Health Check Program
We've been using NHC (https://github.com/mej/nhc) for years with much success. However, that project hasn't had a release in 2 years, and the issues filed against it suggest there might be problems with Rocky 9 (which we are looking to upgrade to). Do people who are on EL9 use NHC? Is there a fork? Is there different software that people use for node health checks?
-Paul Edmon-
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-leave@lists.schedmd.com