[slurm-users] pam_slurm_adopt does not constrain memory?

Kilian Cavalotti kilian.cavalotti.work at gmail.com
Wed Aug 22 09:58:06 MDT 2018


Hi Christian,

On Wed, Aug 22, 2018 at 7:27 AM, Christian Peter
<christian.peter at itwm.fraunhofer.de> wrote:
> We observed some strange behavior of pam_slurm_adopt regarding the involved
> cgroups:
>
> When we start a shell as a new Slurm job using "srun", the process has its
> freezer, cpuset and memory cgroups set up as e.g.
> "/slurm/uid_5001/job_410318/step_0". That's good!
>
> However, another shell started by an SSH login is handled by
> pam_slurm_adopt. That process is only placed in the freezer and cpuset
> cgroups set up as "/slurm/uid_5001/job_410318/step_extern"; it lacks the
> configuration of the "memory" cgroup. (see output below)

My guess is that you're experiencing first-hand the awesomeness of systemd.

The SSH session likely inherits the default user.slice systemd
cgroups, which take over and override the ones set by Slurm. So,
instead of inheriting the job's limits via the pam_slurm_adopt module,
your SSH shell gets the default systemd cgroup settings, which are
useless in your context.
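
A quick way to check that (just a sketch; the exact paths depend on
your cgroup layout and cgroup.conf) is to compare the cgroup
membership of the two shells:

  # inside the shell started with "srun":
  grep -E 'memory|cpuset|freezer' /proc/self/cgroup
  # all three should point at .../slurm/uid_5001/job_410318/step_0

  # inside the shell started via SSH and adopted by pam_slurm_adopt:
  grep -E 'memory|cpuset|freezer' /proc/self/cgroup
  # if systemd is interfering, freezer/cpuset point at
  # .../job_410318/step_extern while memory points at a
  # user.slice/session scope instead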

I usually get rid of any reference to pam_systemd.so in /etc/pam.d/,
and sometimes push it a bit further and delete
/lib*/security/pam_systemd.so, which inevitably brings a blissful grin
to my face.
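
As a rough sketch (the exact PAM stack file varies by distribution,
e.g. password-auth/system-auth on RHEL-like systems or common-session
on Debian-like ones), that amounts to something like:

  # find where pam_systemd is referenced:
  grep -rn pam_systemd.so /etc/pam.d/

  # comment out the matching "session" line, e.g.:
  #   -session   optional   pam_systemd.so
  # becomes
  #   #-session  optional   pam_systemd.so

  # a freshly adopted SSH session should then show the job's memory
  # cgroup in /proc/self/cgroup instead of a user.slice scope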
You may want to take a look at the following bugs:
* https://bugs.schedmd.com/show_bug.cgi?id=3912
* https://bugs.schedmd.com/show_bug.cgi?id=3674
* https://bugs.schedmd.com/show_bug.cgi?id=3158
but they all boil down to the same conclusion.

Cheers,
-- 
Kilian
