[slurm-users] pam_slurm_adopt always claims no active jobs even when they do

Juergen Salk juergen.salk at uni-ulm.de
Sat Oct 24 07:43:12 UTC 2020


Hi Paul,

Maybe this is totally unrelated, but we see a similar issue with
pam_slurm_adopt when ConstrainRAMSpace=no is set in cgroup.conf and
more than one job is running on the node. There is an open bug
report at:

  https://bugs.schedmd.com/show_bug.cgi?id=9355

As a workaround, we currently advise users not to use ssh but to
attach an interactive shell to an already allocated job by running
the following command:

  srun --jobid <job> --pty /bin/bash

For a single-node job, the user does not even need to know which node
the job is running on. For a multi-node job, the user can add the
'-w <node>' option to select a specific node.
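
The two variants can be sketched as a small shell helper. This is only
an illustration: the function name attach_cmd is hypothetical (it is
not part of Slurm), and job ID 785 and node rtx-03 are borrowed from
the squeue output quoted below. The helper only prints the command
line and does not call srun itself.

```shell
# Hypothetical helper (not part of Slurm): print the srun command that
# attaches an interactive shell to an already running job.
attach_cmd() {
    jobid=$1        # job ID as shown by squeue
    node=${2:-}     # optional node name, only needed for multi-node jobs
    if [ -n "$node" ]; then
        # -w <node> selects the node of the allocation to attach to
        printf 'srun --jobid %s -w %s --pty /bin/bash\n' "$jobid" "$node"
    else
        printf 'srun --jobid %s --pty /bin/bash\n' "$jobid"
    fi
}

attach_cmd 785          # single-node job: no node name required
attach_cmd 785 rtx-03   # multi-node job: pick one node explicitly
```

The printed line can then be pasted into a shell on a login node.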

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471



* Paul Raines <raines at nmr.mgh.harvard.edu> [201023 13:13]:
> 
> I am running Slurm 20.02.3 on CentOS 7 systems.  I have pam_slurm_adopt
> set up in /etc/pam.d/system-auth and slurm.conf has PrologFlags=Contain,X11.
> I have also masked systemd-logind.
> 
> But pam_slurm_adopt always denies login with "Access denied by
> pam_slurm_adopt: you have no active jobs on this node", even when the
> user most definitely has a job running on the node via srun.
> 
> Any clues as to why pam_slurm_adopt thinks there is no job?
> 
> serena [raines] squeue
>              JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>                785    lcnrtx     tcsh   raines  R   19:44:51      1 rtx-03
> serena [raines] ssh rtx-03
> Access denied by pam_slurm_adopt: you have no active jobs on this node
> Authentication failed.
> 
> 

-- 
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3  E28F BD0D 4B33 A997 BA7A


