It looks like the slurm user does not exist on the system. Did you previously run slurmctld and slurmdbd as root? If you remove the two lines (User, Group) from the unit file, the services will start as root again. However, it is recommended to create a dedicated slurm user for the daemons: https://slurm.schedmd.com/quickstart_admin.html#daemons
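As a rough sketch of how to confirm this, something like the following reads the User= value from the unit file and checks whether that account exists. The helper names unit_user and user_exists are made up for illustration, the unit path is an assumption (systemctl cat slurmctld shows where your package put it), and the groupadd/useradd lines are only printed as a suggestion, not executed:

```shell
#!/bin/sh
# Hypothetical helper: print the account named by the last User= line
# in a systemd unit file (prints nothing if there is no such line).
unit_user() {
    sed -n 's/^User=//p' "$1" | tail -n 1
}

# Hypothetical helper: succeed only if the account can be resolved.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Assumed location; packages may install the unit somewhere else.
unit=/usr/lib/systemd/system/slurmctld.service

if [ -r "$unit" ]; then
    u=$(unit_user "$unit")
    if [ -n "$u" ] && ! user_exists "$u"; then
        echo "account '$u' is missing; as root, something like:"
        echo "  groupadd -r $u"
        # Login shell path is an assumption; it may be /usr/sbin/nologin.
        echo "  useradd -r -g $u -s /sbin/nologin $u"
    fi
fi
```

If you do create the account, SlurmUser in slurm.conf should name the same user, and that user needs write access to the state and log directories configured there. It is also usually advisable to give the account the same UID on every node.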
On Fri, Jan 19, 2024, 16:02 Miriam Olmi miriam.olmi@lngs.infn.it wrote:
Hi all,
I am having an issue with the new version of slurm, 23.11.0-1.
I had already installed and configured slurm 23.02.3-1 on my cluster and all the services were active and running properly.
After installing the new version of slurm with the same procedure, the slurmctld and slurmdbd daemons both fail to start with the same error:
(code=exited, status=217/USER)
And investigating the problem with the command journalctl -xe I find:
slurmctld.service: Failed to determine user credentials: No such process
slurmctld.service: Failed at step USER spawning /usr/sbin/slurmctld: No such process
I had a look at the slurmctld.service file for both the slurm versions and I found the following differences in the [Service] section.
From the slurmctld.service file of slurm 23.02.3-1:
[Service]
Type=simple
EnvironmentFile=-/etc/sysconfig/slurmctld
EnvironmentFile=-/etc/default/slurmctld
ExecStart=/usr/sbin/slurmctld -D -s $SLURMCTLD_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536
TasksMax=infinity
From the slurmctld.service file of slurm 23.11.0-1:
[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/slurmctld
EnvironmentFile=-/etc/default/slurmctld
User=slurm
Group=slurm
ExecStart=/usr/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536
TasksMax=infinity
I think the new lines regarding the slurm user might be the problem, but I am not sure, and I have no idea how to solve it.
Can anyone help me?
Thanks in advance, Miriam