[slurm-users] pam_slurm_adopt seems not working properly under "configless" slurm mode
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Apr 21 08:44:09 UTC 2020
On 21-04-2020 04:58, Haoyang Liu wrote:
> I am setting up the latest slurm-20.02-1 on my clusters and trying to configure the "configless" slurm on the compute nodes.
> After following the instructions from https://slurm.schedmd.com/configless_slurm.html, both slurmctld and slurmd work fine.
> The config files can be found at $SlurmdSpoolDir/conf-cache and /run/slurm/conf. However, when I try to ssh into some compute
> node, say `comput6`,
>
> $ ssh comput6
>
> the prompt hangs for about one minute and finally returns 'No Slurm jobs found on node'. Previously the message was
> 'Access denied by pam_slurm_adopt: you have no active jobs on this node'.
>
> The issue can be reproduced on centos 6 and 7. I've checked /var/log/secure and noticed the following output:
>
> comput6 pam_slurm_adopt[43672]: error: s_p_parse_file: unable to status file /usr/local/slurm/etc/slurm.conf: No such file or directory, retrying in 1sec up to 60sec
>
> It seems that pam_slurm_adopt still looks for the config file in the default directory when running in "configless" mode.
> Creating a symlink in /usr/local/slurm/etc seems to work around it, but that defeats the purpose of "configless" slurm.
>
> Is there a better way to fix this?
This issue has been reported previously by others, and there is a recent
bug report https://bugs.schedmd.com/show_bug.cgi?id=8712 which you could
follow for updates.
Probably the issue needs to be reported by a customer with a SchedMD
support contract before a solution can be expected.
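Until then, the symlink workaround you mention could be sketched roughly as below. This is only an illustration, not a tested fix: the helper name link_slurm_conf is made up for this example, and it assumes that the cached file is named slurm.conf under /run/slurm/conf and that pam_slurm_adopt looks for /usr/local/slurm/etc/slurm.conf, as your log line suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): point the path that
# pam_slurm_adopt expects at the config cached by configless slurmd.
link_slurm_conf() {
    src="$1"   # cached config written by slurmd
    dst="$2"   # default path that pam_slurm_adopt tries to open

    # Refuse to create a dangling link if slurmd has not fetched the config yet.
    if [ ! -e "$src" ]; then
        echo "link_slurm_conf: $src not found" >&2
        return 1
    fi

    # Create the target directory if needed, then (re)create the symlink.
    mkdir -p "$(dirname "$dst")" && ln -sfn "$src" "$dst"
}

# Assumed usage, run as root on each compute node after slurmd has started:
#   link_slurm_conf /run/slurm/conf/slurm.conf /usr/local/slurm/etc/slurm.conf
```

Since /run is normally cleared at boot, the link would dangle until slurmd has started and fetched the config again, so this is a stopgap at best.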
/Ole