[slurm-users] unable to ssh onto compute nodes on which I have running jobs

Fulcomer, Samuel samuel_fulcomer at brown.edu
Wed Jul 27 18:43:26 UTC 2022


From our /etc/pam.d/sshd on our compute nodes:


account    required     pam_nologin.so
account    sufficient    pam_access.so
account    include      password-auth
-account    required      pam_slurm_adopt.so


...and /pam.d/password-auth:

#-session     optional      pam_systemd.so

Note that disabling pam_systemd is necessary to have the ssh login properly
fenced by cgroups.
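
For anyone who wants to check their own nodes against the two snippets above,
here is a minimal sketch in Python. It is only an illustration under
assumptions, not something we run: the function names (parse_pam_lines,
adopt_is_last_account_entry, module_is_active) are made up for this example,
the sample text is copied from the snippets above, and "include" lines are not
followed into other files. It reports whether pam_slurm_adopt.so is the last
active account entry and whether pam_systemd.so is still enabled.

```python
import re

# PAM line layout: type, control (plain word or bracketed [...] form), module.
# A leading "-" on the type means "skip silently if the module is missing".
PAM_LINE = re.compile(r"^-?(\w+)\s+(\[[^\]]*\]|\S+)\s+(\S+)")

# Sample text copied from the snippets in this message (not read from disk).
SSHD_SAMPLE = """\
account    required     pam_nologin.so
account    sufficient    pam_access.so
account    include      password-auth
-account    required      pam_slurm_adopt.so
"""

PASSWORD_AUTH_SAMPLE = """\
#-session     optional      pam_systemd.so
"""


def parse_pam_lines(text):
    """Return (type, control, module) tuples for active, non-comment lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank or commented out, so not active
        match = PAM_LINE.match(line)
        if match:
            entries.append(match.groups())
    return entries


def adopt_is_last_account_entry(entries):
    """True if the final active 'account' entry is pam_slurm_adopt.so."""
    account = [e for e in entries if e[0] == "account"]
    return bool(account) and account[-1][2] == "pam_slurm_adopt.so"


def module_is_active(entries, module):
    """True if any active line references the given module name."""
    return any(e[2] == module for e in entries)


sshd_entries = parse_pam_lines(SSHD_SAMPLE)
auth_entries = parse_pam_lines(PASSWORD_AUTH_SAMPLE)
print("pam_slurm_adopt last in account stack:",
      adopt_is_last_account_entry(sshd_entries))
print("pam_systemd still active:",
      module_is_active(auth_entries, "pam_systemd.so"))
```

On a real node you would pass in the contents of /etc/pam.d/sshd and the
password-auth file instead of the sample strings.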


On Wed, Jul 27, 2022 at 1:35 PM byron <lbgpublic at gmail.com> wrote:

> This happens on all our compute nodes.
>
> I can't find any mention of pam_slurm_adopt in /etc/pam.d.  All I have, in
> sshd, is: account required pam_slurm.so.
>
> On Wed, Jul 27, 2022 at 5:52 PM Brian Andrus <toomuchit at gmail.com> wrote:
>
>> Lloyd,
>>
>> You could check the order of entries in your pam.d/sshd (and
>> related/included) files.
>>
>> See where pam_slurm_adopt is, how it is being called, and whether there are
>> settings that are interfering.
>>
>> Does this occur only on a single node, or all of them?
>>
>> Brian Andrus
>> On 7/27/2022 9:29 AM, Lloyd Goodman wrote:
>>
>> I don't think that's the source of the problem.  All our user accounts
>> are centrally managed using sssd.
>>
>> And just to be sure, I ran "getent passwd <username>" on the management,
>> head and compute nodes, and they all returned the same values.
>>
>> On Wed, 27 Jul 2022 at 17:22, Brian Andrus <toomuchit at gmail.com> wrote:
>>
>>> Verify that their uid on the node is the same as the uid your master sees
>>>
>>> Brian Andrus
>>>
>>>
>>> On 7/27/2022 8:53 AM, byron wrote:
>>> > Hi
>>> >
>>> > When a user tries to log in to a compute node on which they have a
>>> > running job, they get the error:
>>> >
>>> > Access denied: user blahblah (uid=3333) has no active jobs on this node.
>>> > Authentication failed.
>>> >
>>> > I recently upgraded Slurm to 20.11.9 and was under the impression that
>>> > prior to the upgrade they were able to ssh into nodes where they had
>>> > running jobs, but it's entirely possible that I'm mistaken.
>>> >
>>> > Either way, can someone explain how to enable that behaviour, please?
>>> >
>>> > Thanks
>>
>> --
>> *Lloyd Goodman* // HPC Systems Administrator
>>
>> *e: *lloyd.goodman at cfms.org.uk     *w: *www.cfms.org.uk
>> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons
>> Green // Bristol // BS16 7FR
>>