[slurm-users] NHC and slurm
heitorpbittencourt at gmail.com
Thu Apr 15 13:58:31 UTC 2021
I'm trying to set up NHC for our Slurm cluster, but I can't get
it to work properly.
I'm using the dev branch from  and compiled it this way:
$ ./autogen.sh --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
$ make test
$ sudo make install
When I run nhc, I get an error that sshd is not running:
$ sudo nhc
ERROR: nhc: Health check failed: check_ps_service: Service sshd (process sshd) owned by root not running
I know sshd is running because I logged in to this machine over ssh, and
`systemctl status sshd` shows it as active.
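For cross-checking, here is a minimal sketch of how one might list what `ps` itself reports for a given process name. It assumes the NHC check compares against the command name shown by `ps`, which I have not confirmed from the NHC source. The helper name `match_comm` is hypothetical and not part of NHC.

```shell
#!/bin/sh
# Hypothetical helper (not part of NHC): reads "user comm args..." rows on
# stdin and prints only the rows whose second field (the short command
# name) equals the given name exactly.
match_comm() {
    awk -v name="$1" '$2 == name'
}

# -e selects all processes; each "-o col=" picks a column and the
# trailing "=" suppresses its header line.
ps -e -o user= -o comm= -o args= | match_comm sshd
```

If this prints nothing for `sshd` while `systemctl status sshd` reports it active, the name `ps` shows may differ from the one given to check_ps_service.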
Here's a sample of my nhc.conf:
* || check_ps_service munged
* || check_ps_service -u root sshd
* || check_ps_service -u root ssh
* || check_ps_service ssh
* || check_ps_service sshd
If I run `sudo nhc -a` to run all the tests, it gives 4 errors about
NHC can find munge running, so what's the problem with ssh? What am I
I'm using Ubuntu 20.04.