Hi,

Already done as part of the build process.

regards

Steven 



From: Williams, Jenny Avis <jenny_williams@unc.edu>
Sent: Friday, 31 January 2025 9:06 am
To: Steven Jones <steven.jones@vuw.ac.nz>; John Hearns <hearnsj@gmail.com>
Cc: slurm-users@schedmd.com <slurm-users@schedmd.com>
Subject: RE: [slurm-users] Re: RHEL8.10 V slurmctld
 

First I'd verify munge functionality in the updated environment:

 

https://github.com/dun/munge/wiki/Installation-Guide#troubleshooting
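
As a rough sketch of the kind of check that guide describes: encode a credential, decode it locally, then decode it on a node. The helper name check_munge and the host argument are illustrative only, and this assumes munge and unmunge are on the PATH and that ssh to the node works.

```shell
# Encode a credential locally and decode it locally, then (optionally)
# decode it on a remote host to confirm both sides accept the same credential.
# check_munge and its host argument are illustrative names, not part of munge.
check_munge() {
    host=${1:-}
    if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
        echo "munge/unmunge not found" >&2
        return 2
    fi
    # Local round trip: fails if the credential cannot be created or decoded here.
    munge -n | unmunge >/dev/null || { echo "local munge round trip failed" >&2; return 1; }
    # Remote decode: typically fails if the keys differ or the clocks are too far apart.
    if [ -n "$host" ]; then
        munge -n | ssh "$host" unmunge >/dev/null || { echo "remote decode on $host failed" >&2; return 1; }
    fi
    echo "munge OK"
}
```

Running check_munge node1 on the controller would exercise both the local and the remote step; with no argument only the local round trip is tested.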

 

 

 

 

From: Steven Jones <steven.jones@vuw.ac.nz>
Sent: Thursday, January 30, 2025 2:55 PM
To: Williams, Jenny Avis <jenny_williams@unc.edu>; John Hearns <hearnsj@gmail.com>
Cc: slurm-users@schedmd.com
Subject: Re: [slurm-users] Re: RHEL8.10 V slurmctld

 

Hi,

Hmmm, yes I am using munge.

 

[root@node1 ~]# strings `which slurmd` |egrep -i munge

[root@node1 ~]#

 

That returns nothing on the nodes, although the same check worked fine on RHEL9.5.
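
An empty result from strings does not by itself show that munge support is missing, since auth/munge is a plugin (auth_munge.so) loaded from the configured PluginDir. A minimal sketch of a more direct check follows; has_munge_plugin is a made-up helper name, and the directory to pass is whatever PluginDir reports on the host.

```shell
# Report whether the munge auth plugin exists in a given plugin directory.
# The directory is whatever "scontrol show config | grep -i PluginDir" reports.
# has_munge_plugin is a hypothetical helper, not a Slurm command.
has_munge_plugin() {
    dir=${1:?usage: has_munge_plugin PLUGIN_DIR}
    if [ -f "$dir/auth_munge.so" ]; then
        echo "found $dir/auth_munge.so"
    else
        echo "missing $dir/auth_munge.so" >&2
        return 1
    fi
}
```

The argument would be the PluginDir value shown by scontrol show config on the node being checked.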

[root@xxxunidrslurmd2 munge]# scontrol show config |egrep -i auth

AuthAltTypes            = (null)

AuthAltParameters       = (null)

AuthInfo                = (null)

AuthType                = auth/munge

[root@vuwunidrslurmd2 munge]#

The munge logs are zero length.

 

=============

 

slurmd2 slurm]# rpm -qi munge

Name        : munge

Version     : 0.5.13

Release     : 2.el8

Architecture: x86_64

Install Date: Wed 15 Jan 2025 02:11:46 AM UTC

Group       : Unspecified

Size        : 320124

License     : GPLv3+ and LGPLv3+

Signature   : RSA/SHA256, Mon 27 Apr 2020 11:43:24 PM UTC, Key ID 199e2f91fd431d51

Source RPM  : munge-0.5.13-2.el8.src.rpm

Build Date  : Fri 24 Apr 2020 07:37:08 AM UTC

Build Host  : x86-vm-02.build.eng.bos.redhat.com

Relocations : (not relocatable)

Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>

Vendor      : Red Hat, Inc.

URL         : https://dun.github.io/munge/

Summary     : Enables uid & gid authentication across a host cluster

Description :

MUNGE (MUNGE Uid 'N' Gid Emporium) is an authentication service for creating

and validating credentials. It is designed to be highly scalable for use

in an HPC cluster environment.

It allows a process to authenticate the UID and GID of another local or

remote process within a group of hosts having common users and groups.

These hosts form a security realm that is defined by a shared cryptographic

key. Clients within this security realm can create and validate credentials

without the use of root privileges, reserved ports, or platform-specific

methods.

 

==========

 

[root@node1 ~]# rpm -qi munge

Name        : munge

Version     : 0.5.13

Release     : 2.el8

Architecture: x86_64

Install Date: Sun Jan 26 22:14:28 2025

Group       : Unspecified

Size        : 319876

License     : GPLv3+ and LGPLv3+

Signature   : RSA/SHA256, Mon Apr 12 06:46:59 2021, Key ID 15af5dac6d745a60

Source RPM  : munge-0.5.13-2.el8.src.rpm

Build Date  : Mon Apr 12 05:07:29 2021

Build Host  : ord1-prod-x86build003.svc.aws.rockylinux.org

Relocations : (not relocatable)

Packager    : infrastructure@rockylinux.org

Vendor      : Rocky

URL         : https://dun.github.io/munge/

Summary     : Enables uid & gid authentication across a host cluster

Description :

MUNGE (MUNGE Uid 'N' Gid Emporium) is an authentication service for creating

and validating credentials. It is designed to be highly scalable for use

in an HPC cluster environment.

It allows a process to authenticate the UID and GID of another local or

remote process within a group of hosts having common users and groups.

These hosts form a security realm that is defined by a shared cryptographic

key. Clients within this security realm can create and validate credentials

without the use of root privileges, reserved ports, or platform-specific

methods.

[root@node1 ~]#

 

==========

 

[root@node1 ~]# rpm -qi slurm-slurmd

Name        : slurm-slurmd

Version     : 20.11.9

Release     : 1.el8

Architecture: x86_64

Install Date: Sun Jan 26 22:15:34 2025

Group       : Unspecified

Size        : 517922

License     : GPLv2 and BSD

Signature   : RSA/SHA256, Fri May  6 00:03:49 2022, Key ID 21ea45ab2f86d6a1

Source RPM  : slurm-20.11.9-1.el8.src.rpm

Build Date  : Thu May  5 23:33:30 2022

Build Host  : buildvm-x86-18.iad2.fedoraproject.org

Relocations : (not relocatable)

Packager    : Fedora Project

Vendor      : Fedora Project

URL         : https://slurm.schedmd.com/

Bug URL     : https://bugz.fedoraproject.org/slurm

Summary     : Slurm compute node daemon

Description :

Slurm compute node daemon. Used to launch jobs on compute nodes

[root@node1 ~]#

=======

 

slurmd2 ~]#  rpm -qi slurm-slurmctld

Name        : slurm-slurmctld

Version     : 20.11.9

Release     : 1.el8

Architecture: x86_64

Install Date: Wed 15 Jan 2025 02:11:48 AM UTC

Group       : Unspecified

Size        : 1097306

License     : GPLv2 and BSD

Signature   : RSA/SHA256, Fri 06 May 2022 12:03:49 AM UTC, Key ID 21ea45ab2f86d6a1

Source RPM  : slurm-20.11.9-1.el8.src.rpm

Build Date  : Thu 05 May 2022 11:33:30 PM UTC

Build Host  : buildvm-x86-18.iad2.fedoraproject.org

Relocations : (not relocatable)

Packager    : Fedora Project

Vendor      : Fedora Project

URL         : https://slurm.schedmd.com/

Bug URL     : https://bugz.fedoraproject.org/slurm

Summary     : Slurm controller daemon

Description :

Slurm controller daemon. Used to manage the job queue, schedule jobs,

and dispatch RPC messages to the slurmd processon the compute nodes

to launch jobs.

 

 

 

regards

Steven 

 


From: Williams, Jenny Avis <jenny_williams@unc.edu>
Sent: Friday, 31 January 2025 8:36 am
To: Steven Jones <steven.jones@vuw.ac.nz>; John Hearns <hearnsj@gmail.com>
Cc: slurm-users@schedmd.com <slurm-users@schedmd.com>
Subject: RE: [slurm-users] Re: RHEL8.10 V slurmctld

 



On both a compute node and the controller, run:

rpm -qi slurm-slurmctld

rpm -qi slurm-slurmd

Then check what the auth type is. For example, we still use munge, which in my build is also the default auth type:

# strings `which slurmd` |egrep -i munge

DEFAULT_AUTH_TYPE "auth/munge"

DEFAULT_CRED_TYPE "cred/munge"

 


# scontrol show config |egrep -i auth

AuthAltTypes            = (null)

AuthAltParameters       = (null)

AuthInfo                = (null)

AuthType                = auth/munge

 

From: Steven Jones via slurm-users <slurm-users@lists.schedmd.com>
Sent: Thursday, January 30, 2025 2:07 PM
To: John Hearns <hearnsj@gmail.com>
Cc: slurm-users@schedmd.com
Subject: [slurm-users] Re: RHEL8.10 V slurmctld

 

Hi,

Yes, even ssh works OK.    

[root@xxxunicobuildt1 warewulf]# ssh xxxjonesst@xxx.ac.nz@node1

(xxxjonesst@xxx.ac.nz@node1) Password:

Last login: Wed Jan 29 01:26:21 2025 from 130.195.87.12

[xxxjonesst@xxx.ac.nz@node1 ~]$

 

[xxxjonesst@xxx.ac.nz@node1 ~]$ whoami | id

uid=1204805830(xxxjonesst@xxx.ac.nz) gid=1204805830(xxxjonesst@xxx.ac.nz)

 

 

tail -f /var/log/secure

=========

 

Jan 30 18:19:56 node1 sshd[15443]: pam_sss(sshd:auth): authentication success; logname= uid=0 euid=0 tty=ssh ruser= rhost=130.195.87.12 user=xxxjonesst@xxx.ac.nz

Jan 30 18:19:56 node1 sshd[15440]: Accepted keyboard-interactive/pam for xxxjonesst@xxx.ac.nz from 130.195.87.12 port 59402 ssh2

Would there be any relevant changes between RHEL8's slurm and RHEL9's slurm?

[root@node1 ~]# rpm -qa |grep slurm

slurm-libs-20.11.9-1.el8.x86_64

slurm-slurmd-20.11.9-1.el8.x86_64

slurm-20.11.9-1.el8.x86_64

[root@node1 ~]#


I would have to go back and check, but I do not think I hit this on RHEL9. What I did get was that srun version 22 on the RHEL9 server didn't like srun version 20 on the rocky8 node.

Can I compile or rpm-build srun version 22 to run on rocky8, or is that part of slurmd?
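
To make such a mismatch visible, a small sketch like the one below compares the release reported on each side. It assumes the version line has the form "slurm 20.11.9", as srun --version and slurmd -V print it; slurm_release and same_release are hypothetical helper names.

```shell
# Reduce a version line such as "slurm 20.11.9" to its release, "20.11".
slurm_release() {
    echo "$1" | sed -n 's/^slurm \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Compare two version lines and say whether they are the same release.
# Prints "unknown" for a line that does not look like "slurm X.Y.Z".
same_release() {
    a=$(slurm_release "$1")
    b=$(slurm_release "$2")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "same release: $a"
    else
        echo "release mismatch: ${a:-unknown} vs ${b:-unknown}"
        return 1
    fi
}
```

On the server this could be called as same_release "$(srun --version)" "$(ssh node1 slurmd -V)". To see which package owns srun on a given host, rpm -qf "$(command -v srun)" should answer that.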

 

regards

Steven 


From: John Hearns <hearnsj@gmail.com>
Sent: Thursday, 30 January 2025 10:53 pm
To: Steven Jones <steven.jones@vuw.ac.nz>
Cc: slurm-users@schedmd.com <slurm-users@schedmd.com>
Subject: Re: [slurm-users] RHEL8.10 V slurmctld

 


Have you run id on a compute node?
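
One way to do that, as a rough sketch: resolve the user on the controller and on the node and compare the numeric UIDs. uid_matches is just an illustrative helper and assumes ssh access from the controller to the node.

```shell
# Compare the numeric UID a user resolves to locally and on a remote node.
# uid_matches is an illustrative helper; it relies on ssh access to the node.
uid_matches() {
    user=${1:?usage: uid_matches USER HOST}
    host=${2:?usage: uid_matches USER HOST}
    local_uid=$(id -u "$user") || { echo "$user unknown locally" >&2; return 1; }
    remote_uid=$(ssh "$host" id -u "$user") || { echo "$user unknown on $host" >&2; return 1; }
    if [ "$local_uid" = "$remote_uid" ]; then
        echo "uid $local_uid matches on $host"
    else
        echo "uid differs: local $local_uid, $host $remote_uid" >&2
        return 1
    fi
}
```

Called from the controller with the affected user and a compute node name, it reports whether the node can resolve the user at all and whether the UID agrees.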

 

On Wed, Jan 29, 2025, 6:47PM Steven Jones via slurm-users <slurm-users@lists.schedmd.com> wrote:

I am using Red Hat's IdM/IPA for users.

 

Slurmctld is failing to run jobs and it is getting "invalid user id".

 

"[2025-01-28T21:48:50.271] sched: Allocate JobId=4 NodeList=node4 #CPUs=1 Partition=debug

[2025-01-28T21:48:50.280] Killing non-startable batch JobId=4: Invalid user id"

 

id on the slurm controller works fine.  

 

[xxxjoness@xxx.ac.nz@hpcunidrslurmd2 ~]$ id xxxjoness@xxx.ac.nz
uid=1204805830(xxxjoness@xxx.ac.nz) gid=1204805830(xxxjoness@xxx.ac.nz) groups=1204805830(xxxjoness@xxx.ac.nz)  8><---

Any ideas, please? I have run out.

 

I have tried RHEL9.5; this seemed to run, but srun there is version 22 and on rocky8 it is version 20, so it fails.

 

regards

Steven 


--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-leave@lists.schedmd.com