[slurm-users] SLURM PAM support?
Yair Yarom
irush at cs.huji.ac.il
Mon Jun 18 08:53:22 MDT 2018
Hi,
We encountered this issue some time ago (see:
https://www.mail-archive.com/slurm-dev@schedmd.com/msg06628.html). You
need to add pam_systemd to the slurm pam file, but pam_systemd will
try to take over slurm's cgroups. Our current solution is to add
pam_systemd to the slurm pam file and, in addition, to save and restore
the slurm cgroup locations. It's not pretty, but for now it works...
If you don't constrain the devices (i.e. don't have GPUs), you can
probably do without the pam_exec script and use pam_systemd
normally.
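To illustrate the save/restore idea, here is a rough Python sketch. It is
not the script from the repository linked below, and it rests on several
assumptions: cgroup v1 mounted under /sys/fs/cgroup, slurm cgroup paths
that contain "/slurm", and hypothetical helper names (parse_proc_cgroup,
slurm_locations, restore) made up for this example. The "save" step reads
/proc/<pid>/cgroup before pam_systemd runs; the "restore" step writes the
pid back into the saved cgroups afterwards.

```python
from pathlib import Path

# Assumption: cgroup v1 hierarchies are mounted here.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def parse_proc_cgroup(text):
    """Map controller list -> cgroup path from /proc/<pid>/cgroup content."""
    locations = {}
    for line in text.splitlines():
        # Each line has the form: hierarchy-ID:controller-list:cgroup-path
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue
        _, controllers, path = parts
        if controllers:  # skip the cgroup v2 entry, which looks like "0::/..."
            locations[controllers] = path
    return locations


def slurm_locations(locations):
    """Keep only the entries that look like slurm-managed cgroups (assumed)."""
    return {c: p for c, p in locations.items() if "/slurm" in p}


def restore(pid, locations, root=CGROUP_ROOT):
    """Move pid back into each saved cgroup by writing to cgroup.procs."""
    for controllers, path in locations.items():
        # Named hierarchies such as "name=systemd" are mounted without "name=".
        mount = controllers.split("=", 1)[-1]
        procs = root / mount / path.lstrip("/") / "cgroup.procs"
        procs.write_text(f"{pid}\n")
```

A pam_exec hook would call something like
slurm_locations(parse_proc_cgroup(...)) before pam_systemd and restore(...)
after it; how the saved state is passed between the two calls is left out
here.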
We're using Debian, but the basics should be the same. I've put the
script on GitHub, if you want to try it:
https://github.com/irush-cs/slurm-scripts
Yair.
On Mon, Jun 18, 2018 at 3:33 PM, John Hearns <hearnsj at googlemail.com> wrote:
> Your problem is that you are listening to Lennart Poettering...
> I cannot answer your question directly. However I am doing work at the
> moment with PAM and sssd.
> Have a look at the directory which contains the unit files. Go to
> /lib/systemd/system
> See that nice file named -.slice? Yes, that file is absolutely needed, it
> is not line noise.
> Now try to grep on the files in that directory, since you might want to
> create a new systemd unit file based on an existing one.
>
> Yes, a regexp guru will point out that this is trivial. But to me creating
> files that look like -.slice is putting your head in the lion's mouth.
>
> On 18 June 2018 at 14:15, Maik Schmidt <maik.schmidt at tu-dresden.de> wrote:
>>
>> Hi,
>>
>> we're currently in the process of migrating from RHEL6 to 7, which also
>> brings us the benefit of having systemd. However, we are observing problems
>> with user applications that use e.g. XDG_RUNTIME_DIR, because SLURM
>> apparently does not really run the user application through the PAM stack.
>> The consequence is that SLURM jobs inherit the XDG_* environment variables
>> from the login nodes (where sshd properly sets it up), but on the compute
>> nodes, /run/user/$uid does not exist, leading to errors whenever a user
>> application tries to access it.
>>
>> We have tried setting UsePam=1, but that did not help.
>>
>> I have found the following issue on the systemd project regarding exactly
>> this problem: https://github.com/systemd/systemd/issues/3355
>>
>> There, Lennart Poettering argues that it should be the responsibility of
>> the scheduler software (i.e. SLURM) to run user code only within a proper
>> PAM session.
>>
>> My question: does SLURM support this? If yes, how?
>>
>> If not, what are best practices to circumvent this problem on
>> RHEL7/systemd installations? Surely other clusters must have already had the
>> same issue...
>>
>> Thanks in advance.
>>
>> --
>> Maik Schmidt
>> HPC Services
>>
>> Technische Universität Dresden
>> Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
>> Willers-Bau A116
>> D-01062 Dresden
>> Telefon: +49 351 463-32836
>>
>>
>