[slurm-users] slurmrestd service broken by 22.05.07 update
Chris Stackpole
cstackpole at advancedclustering.com
Thu Dec 29 14:49:58 UTC 2022
Greetings,
Thanks for responding!
On 12/28/22 20:35, Brian Andrus wrote:
> I suspect if you delete /var/lib/slurmrestd.socket and then start
> slurmrestd, it will create it as the user you need it to be.
>
> Or just change the owner of it to the slurmrestd owner.
No go on that, because creating /var/lib/slurmrestd.socket inside
/var/lib requires root. That is what I meant by "has to write into
a root-only directory to create the unix socket".
Here, I'll show what happens for me.
I spun up a virtual machine with a fresh, unmodified compile of
22.05.07.
# rm -rf /var/lib/slurmrestd.socket
# systemctl start slurmrestd
# systemctl status slurmrestd
<snip>
Active: failed (Result: exit-code) since Thu 2022-12-29 08:39:45 CST; 54s ago
<snip>
# journalctl -xe
<snip>
Dec 29 08:39:45 testslurmvm.cluster slurmrestd[114317]: fatal: _create_socket: [unix:/var/lib/slurmrestd.socket] Unable to bind UNIX socket: Permission denied
Dec 29 08:39:45 testslurmvm.cluster systemd[1]: slurmrestd.service: Main process exited, code=exited, status=1/FAILURE
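The "Permission denied" above is general UNIX socket behavior rather than something specific to slurmrestd: bind() on a pathname socket creates a filesystem entry, so the process needs write and search permission on the parent directory. Here is a minimal Python sketch of that check. The helper names are made up for illustration and are not part of Slurm, and os.access() tests the real UID, so it only reflects a daemon that has fully switched users.

```python
import os
import socket
import stat
import tempfile


def can_create_socket(path: str) -> bool:
    """Return True if the current user may create a new entry at `path`.

    bind() on a UNIX socket creates a file, so the caller needs write and
    search (execute) permission on the parent directory.
    """
    parent = os.path.dirname(os.path.abspath(path))
    return os.access(parent, os.W_OK | os.X_OK)


def bind_unix_socket(path: str) -> socket.socket:
    """Bind a listening UNIX socket at `path`; raises OSError on failure."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    sock.listen(1)
    return sock


if __name__ == "__main__":
    # Path taken from the log above; the result depends on who runs this.
    print(can_create_socket("/var/lib/slurmrestd.socket"))

    # In a directory we own, the same kind of bind succeeds.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.socket")  # hypothetical path
        print(can_create_socket(demo_path))
        server = bind_unix_socket(demo_path)
        print(stat.S_ISSOCK(os.stat(demo_path).st_mode))
        server.close()
```

Running this as the unprivileged service user is a quick way to see whether a given socket location is usable before starting the daemon.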
Now what about giving ownership to the user?
# touch /var/lib/slurmrestd.socket
# systemctl start slurmrestd
# systemctl status slurmrestd
<snip>
Active: failed (Result: exit-code) since Thu 2022-12-29 08:45:37 CST; 1min 2s ago
<snip>
# journalctl -xe
<snip>
Dec 29 08:45:37 testslurmvm.cluster slurmrestd[114402]: error: Error unlink(/var/lib/slurmrestd.socket): Permission denied
Dec 29 08:45:37 testslurmvm.cluster slurmrestd[114402]: fatal: _create_socket: [unix:/var/lib/slurmrestd.socket] Unable to bind UNIX socket: Address already in use
Again, it doesn't have permission to remove that file or to create
files inside that directory.
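The two log lines fit together: removing a file needs write permission on its directory, so the unlink fails, and bind() then refuses the path because something already exists there. A small Python sketch of the second half, run in a temporary directory instead of /var/lib, might look like this. The function name and path are hypothetical.

```python
import errno
import os
import socket
import tempfile


def bind_errno(path: str):
    """Try to bind a UNIX socket at `path`; return the errno name or None."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.socket")  # hypothetical path
        open(path, "w").close()   # same effect as `touch`
        print(bind_errno(path))   # bind will not reuse an existing path
        os.unlink(path)           # needs write permission on the directory
        print(bind_errno(path))   # path is free again, so bind can succeed
```

So pre-creating the file cannot help: the path has to be absent at bind time, and whoever binds has to be able to write to the directory.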
On 12/28/22 20:35, Brian Andrus wrote:
> I have been running slurmrestd as a separate user for some time.
Under 22.05.07? Because that's what broke things for me. And I think
that it's this change:
| -- slurmrestd - switch users earlier on startup to avoid sockets being
| made as root.
I'm not saying it's a bad change, but I don't see any documentation
on the proper way to handle it, and editing the service file doesn't
feel like the right approach.
Thanks!