[slurm-users] slurmrestd service broken by 22.05.07 update
Chris Stackpole
cstackpole at advancedclustering.com
Wed Dec 28 23:20:01 UTC 2022
Greetings,
After updating to 22.05.07 (manually built from source), slurmrestd
fails to start:
slurmrestd[68695]: fatal: _create_socket: [unix:/var/lib/slurmrestd.socket] Unable to bind UNIX socket: Permission denied
Looking at release notes:
> -- slurmrestd - switch users earlier on startup to avoid sockets being
> made as root.
OK... So...
$ cat /usr/lib/systemd/system/slurmrestd.service
<snip>
# slurmrestd should not run as root or the slurm user.
# Please either use the -u and -g options in /etc/sysconfig/slurmrestd
<snip>
# Default to listen on both socket and slurmrestd port
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS unix:/var/lib/slurmrestd.socket 0.0.0.0:6820
<snip>
$ cat /etc/sysconfig/slurmrestd
SLURMRESTD_OPTIONS="-v -u slurmrestd"
$ ls -ld /var/lib
drwxr-xr-x. 74 root root 4096 Dec 21 10:55 /var/lib
So then, if I understand correctly: the process needs to run as a
non-root user, but it has to create its unix socket in a directory that
only root can write to. The only way to adjust this is by editing the
service file, which I will almost certainly forget about and which the
next version update will replace. And there's no documentation I can
find about how we are supposed to configure this chicken-and-egg
situation?
Hrm... ¯\_(ツ)_/¯
I'm not sure if this is a logic bug or if I'm just doing things wrong.
For the record: yes, if I create /var/lib/slurmrestd/, give ownership
to the slurmrestd user, update the service file, run systemctl
daemon-reload, and then start the service, it works. But by default,
from a fresh build, the service is broken, and I haven't found
documentation on the recommended way to solve this.
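A minimal sketch of that workaround, written so that it should survive a
package update: instead of editing the packaged unit, it puts the changed
ExecStart line into a systemd drop-in. This is not a documented Slurm
recommendation. The socket file name inside /var/lib/slurmrestd/, the
drop-in file name override.conf, and the STAGE_ROOT staging variable are
all assumptions made for illustration. By default the script only stages
the files under /tmp so they can be inspected first.

```shell
#!/bin/sh
# Sketch only: the socket name, drop-in name, and STAGE_ROOT are assumptions.
set -eu

# Stage under /tmp by default; run as root with STAGE_ROOT= (empty) to apply for real.
STAGE_ROOT="${STAGE_ROOT-/tmp/slurmrestd-staging}"
SOCKET_DIR="$STAGE_ROOT/var/lib/slurmrestd"
DROPIN_DIR="$STAGE_ROOT/etc/systemd/system/slurmrestd.service.d"

# Directory that will hold the socket, plus the drop-in directory for the unit.
mkdir -p "$SOCKET_DIR" "$DROPIN_DIR"

# The empty ExecStart= clears the packaged command before the new one is set.
# The heredoc is quoted so $SLURMRESTD_OPTIONS is written literally.
cat > "$DROPIN_DIR/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS unix:/var/lib/slurmrestd/slurmrestd.socket 0.0.0.0:6820
EOF

# When applying for real (as root, STAGE_ROOT empty), finish with:
#   chown slurmrestd: /var/lib/slurmrestd
#   systemctl daemon-reload
#   systemctl restart slurmrestd
```

The drop-in leaves the packaged unit file untouched, so the -u slurmrestd
setting in /etc/sysconfig/slurmrestd keeps working as before.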
Thoughts anyone? Please tell me where I'm going wrong on this.
Thanks!