[slurm-users] Can't start slurmdbd

Lachlan Musicman datakid at gmail.com
Mon Nov 20 04:11:36 MST 2017


On 20 November 2017 at 20:50, Juan A. Cordero Varelaq <
bioinformatica-ibis at us.es> wrote:

>     $ systemctl start slurmdbd
>     Job for slurmdbd.service failed because the control process exited
> with error code. See "systemctl status slurmdbd.service" and "journalctl
> -xe" for details.
>     $ systemctl status slurmdbd.service
>     ● slurmdbd.service - Slurm DBD accounting daemon
>        Loaded: loaded (/etc/systemd/system/slurmdbd.service; enabled;
> vendor preset: disabled)
>        Active: failed (Result: exit-code) since lun 2017-11-20 10:39:26
> CET; 53s ago
>       Process: 27592 ExecStart=/usr/sbin/slurmdbd $SLURMDBD_OPTIONS
> (code=exited, status=1/FAILURE)
>
>     nov 20 10:39:26 login_node systemd[1]: Starting Slurm DBD accounting
> daemon...
>     nov 20 10:39:26 login_node systemd[1]: slurmdbd.service: control
> process exited, code=exited status=1
>     nov 20 10:39:26 login_node systemd[1]: Failed to start Slurm DBD
> accounting daemon.
>     nov 20 10:39:26 login_node systemd[1]: Unit slurmdbd.service entered
> failed state.
>     nov 20 10:39:26 login_node systemd[1]: slurmdbd.service failed.
>     $ journalctl -xe
>     nov 20 10:39:26 login_node polkitd[1078]: Registered Authentication
> Agent for unix-process:27586:119889015 (system bus name :1.871
> [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /or
>     nov 20 10:39:26 login_node systemd[1]: Starting Slurm DBD accounting
> daemon...
>     -- Subject: Unit slurmdbd.service has begun start-up
>     -- Defined-By: systemd
>     -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>     --
>     -- Unit slurmdbd.service has begun starting up.
>     nov 20 10:39:26 login_node systemd[1]: slurmdbd.service: control
> process exited, code=exited status=1
>     nov 20 10:39:26 login_node systemd[1]: Failed to start Slurm DBD
> accounting daemon.
>     -- Subject: Unit slurmdbd.service has failed
>     -- Defined-By: systemd
>     -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>     --
>     -- Unit slurmdbd.service has failed.
>     --
>     -- The result is failed.
>     nov 20 10:39:26 login_node systemd[1]: Unit slurmdbd.service entered
> failed state.
>     nov 20 10:39:26 login_node systemd[1]: slurmdbd.service failed.
>     nov 20 10:39:26 login_node polkitd[1078]: Unregistered Authentication
> Agent for unix-process:27586:119889015 (system bus name :1.871, object path
> /org/freedesktop/PolicyKit1/AuthenticationAgent,
>     nov 20 10:40:06 login_node gmetad[1519]: data_thread() for [HPCSIE]
> failed to contact node 192.168.2.10
>     nov 20 10:40:06 login_node gmetad[1519]: data_thread() got no answer
> from any [HPCSIE] datasource
>     nov 20 10:40:13 login_node dhcpd[2320]: DHCPREQUEST for 192.168.2.19
> from XX:XX:XX:XX:XX:XX via enp6s0f1
>     nov 20 10:40:13 login_node dhcpd[2320]: DHCPACK on 192.168.2.19 to
> XX:XX:XX:XX:XX:XX via enp6s0f1
>     nov 20 10:40:39 login_node dhcpd[2320]: DHCPREQUEST for 192.168.2.13
> from XX:XX:XX:XX:XX:XX via enp6s0f1
>     nov 20 10:40:39 login_node dhcpd[2320]: DHCPACK on 192.168.2.13 to
> XX:XX:XX:XX:XX:XX via enp6s0f1
>
> I've just found out the file `/var/run/slurmdbd.pid` does not even exist.
>


The pid file holds the daemon's process ID. It is written when the process
starts, so it is only there while the process is running. When slurmdbd is
not running, it won't be there. Supposedly.
Sometimes I run "touch /var/run/slurmdbd.pid" and try again.
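
To tell a missing pid file apart from a stale one, something like the
following might help. It is only a rough sketch: pid_file_status is a
made-up helper name (not part of Slurm), and it assumes the pid file
contains nothing but the process ID.

```shell
#!/bin/sh
# pid_file_status is a hypothetical helper, not a Slurm tool.
# Prints "missing" if the pid file does not exist, "running" if the
# recorded process can be signalled, and "stale" otherwise.
pid_file_status() {
    pid_file=$1
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it. Run as root, otherwise a
    # daemon owned by another user will also be reported as "stale".
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

For example: pid_file_status /var/run/slurmdbd.pid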

I've also found that using the host's short name is preferable to
localhost. Make sure the host's short name is in /etc/hosts too.

hostname -s

will give you the short name.
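
A quick way to check whether that short name is actually listed in
/etc/hosts could look like this. Again just a sketch: host_in_hosts_file
is a hypothetical helper name, and it only does an exact match against
the hostname and alias columns.

```shell
#!/bin/sh
# host_in_hosts_file is a hypothetical helper, not a Slurm tool.
# Returns 0 if the given name appears as a hostname or alias in a
# hosts file (default /etc/hosts), non-zero otherwise.
host_in_hosts_file() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }   # drop comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }   # field 1 is the IP
        END { exit !found }
    ' "$hosts_file"
}

short=$(hostname -s)
if host_in_hosts_file "$short"; then
    echo "$short is listed in /etc/hosts"
else
    echo "$short is NOT listed in /etc/hosts"
fi
```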

Cheers
L.