I am building a new small cluster on Rocky Linux 9 with Slurm 24.11.5. Slurm was compiled with the default options, except that we added --with pmix.
Our setup has some additional complications: we use NFSv4 with AD authentication (so we use auks), and we run SELinux in enforcing mode. However, I do not currently see any evidence that those relate to the problem we are having, though note that the slurm account is an AD account.
All relevant AD accounts have the RFC 2307 attributes (uid, gid, etc.) configured.
I had also configured pam_slurm_adopt, but as it was causing issues I backed out the changes that enabled it; it was too hard to troubleshoot without being able to log in to the test compute node easily.
The issue I am seeing in testing is that for whatever job I launch, the extern step does not terminate when the job does.
These are the most relevant slurm.conf settings:
Epilog=/etc/slurm/epilog.sh
JobAcctGatherType=jobacct_gather/cgroup
JobCompType=jobcomp/none
KillOnBadExit=1
KillWait=30
ProctrackType=proctrack/cgroup
Prolog=/etc/slurm/prolog.sh
PrologFlags=Alloc,Contain
SelectTypeParameters=CR_Core_Memory
SelectType=select/cons_tres
SlurmUser=slurm
TaskPlugin=task/affinity,task/cgroup
TaskProlog=/etc/slurm/taskprolog.sh
The slurmd.log on the compute node shows a sequence like this for the job (with -vv debug enabled):
[2025-06-26T14:57:23.272] select/cons_tres: init: select/cons_tres loaded
[2025-06-26T14:57:23.272] select/linear: init: Linear node selection plugin loaded with argument 20
[2025-06-26T14:57:23.272] cred/munge: init: Munge credential signature plugin loaded
[2025-06-26T14:57:23.272] [34.extern] debug: auth/munge: init: loaded
[2025-06-26T14:57:23.273] [34.extern] debug: Reading cgroup.conf file /etc/slurm/cgroup.conf
[2025-06-26T14:57:23.279] [34.extern] debug: cgroup/v2: init: Cgroup v2 plugin loaded
[2025-06-26T14:57:23.290] [34.extern] debug: hash/k12: init: init: KangarooTwelve hash plugin loaded
[2025-06-26T14:57:23.291] [34.extern] debug: CPUs:40 Boards:1 Sockets:2 CoresPerSocket:20 ThreadsPerCore:1
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: core enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: task_cgroup_memory_init: task/cgroup/memory: TotCfgRealMem:191587M allowed:100%(enforced), swap:0%(enforced), max:100%(191587M) max+swap:100%(383174M) min:30M
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: memory enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: device enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: Tasks containment cgroup plugin loaded
[2025-06-26T14:57:23.291] [34.extern] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffff
[2025-06-26T14:57:23.291] [34.extern] debug: jobacct_gather/cgroup: init: Job accounting gather cgroup plugin loaded
[2025-06-26T14:57:23.291] [34.extern] topology/default: init: topology Default plugin loaded
[2025-06-26T14:57:23.292] [34.extern] debug: gpu/generic: init: init: GPU Generic plugin loaded
[2025-06-26T14:57:23.293] [34.extern] debug: Setting slurmstepd(2666071) oom_score_adj to -1000
[2025-06-26T14:57:23.293] [34.extern] debug: Message thread started pid = 2666071
[2025-06-26T14:57:23.294] [34.extern] debug: spank: opening plugin stack /etc/slurm/plugstack.conf
[2025-06-26T14:57:23.294] [34.extern] debug: /etc/slurm/plugstack.conf: 1: include "/etc/slurm/plugstack.conf.d/*.conf"
[2025-06-26T14:57:23.294] [34.extern] debug: spank: opening plugin stack /etc/slurm/plugstack.conf.d/auks.conf
[2025-06-26T14:57:23.298] [34.extern] debug: spank: /etc/slurm/plugstack.conf.d/auks.conf:57: Loaded plugin auks.so
[2025-06-26T14:57:23.298] [34.extern] debug: SPANK: appending plugin option "auks"
[2025-06-26T14:57:23.352] [34.extern] spank-auks: new unique ccache is KCM:71433:49222 <<<< This UID 71433 is the user running the job
[2025-06-26T14:57:23.364] [34.extern] spank-auks: user '71433' cred stored in ccache KCM:71433:49222
[2025-06-26T14:57:23.380] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: job abstract cores are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: step abstract cores are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: job physical CPUs are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: step physical CPUs are '0'
[2025-06-26T14:57:23.381] [34.extern] task/cgroup: _memcg_initialize: job: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:23.381] [34.extern] task/cgroup: _memcg_initialize: step: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:23.385] [34.extern] debug: close_slurmd_conn: sending 0: No error
[2025-06-26T14:57:24.371] launch task StepId=34.interactive request from UID:71433 GID:70668 HOST:172.17.11.22 PORT:46948
[2025-06-26T14:57:24.372] task/affinity: lllp_distribution: JobId=34 manual binding: mask_cpu,one_thread
[2025-06-26T14:57:24.372] debug: Waiting for job 34's prolog to complete
[2025-06-26T14:57:24.372] debug: Finished wait for job 34's prolog to complete
[2025-06-26T14:57:24.377] select/cons_tres: init: select/cons_tres loaded
[2025-06-26T14:57:24.377] select/linear: init: Linear node selection plugin loaded with argument 20
[2025-06-26T14:57:24.377] cred/munge: init: Munge credential signature plugin loaded
[2025-06-26T14:57:24.377] [34.interactive] debug: auth/munge: init: loaded
[2025-06-26T14:57:24.379] [34.interactive] debug: Reading cgroup.conf file /etc/slurm/cgroup.conf
[2025-06-26T14:57:24.384] [34.interactive] debug: cgroup/v2: init: Cgroup v2 plugin loaded
[2025-06-26T14:57:24.403] [34.interactive] debug: hash/k12: init: init: KangarooTwelve hash plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] debug: CPUs:40 Boards:1 Sockets:2 CoresPerSocket:20 ThreadsPerCore:1
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: core enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: task_cgroup_memory_init: task/cgroup/memory: TotCfgRealMem:191587M allowed:100%(enforced), swap:0%(enforced), max:100%(191587M) max+swap:100%(383174M) min:30M
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: memory enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: device enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: Tasks containment cgroup plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffff
[2025-06-26T14:57:24.404] [34.interactive] debug: jobacct_gather/cgroup: init: Job accounting gather cgroup plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] topology/default: init: topology Default plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] debug: gpu/generic: init: init: GPU Generic plugin loaded
[2025-06-26T14:57:24.406] [34.interactive] debug: close_slurmd_conn: sending 0: No error
[2025-06-26T14:57:24.406] [34.interactive] debug: Message thread started pid = 2666089
[2025-06-26T14:57:24.406] [34.interactive] debug: Setting slurmstepd(2666089) oom_score_adj to -1000
[2025-06-26T14:57:24.407] [34.interactive] debug: spank: opening plugin stack /etc/slurm/plugstack.conf
[2025-06-26T14:57:24.407] [34.interactive] debug: /etc/slurm/plugstack.conf: 1: include "/etc/slurm/plugstack.conf.d/*.conf"
[2025-06-26T14:57:24.407] [34.interactive] debug: spank: opening plugin stack /etc/slurm/plugstack.conf.d/auks.conf
[2025-06-26T14:57:24.411] [34.interactive] debug: spank: /etc/slurm/plugstack.conf.d/auks.conf:57: Loaded plugin auks.so
[2025-06-26T14:57:24.411] [34.interactive] debug: SPANK: appending plugin option "auks"
[2025-06-26T14:57:24.461] [34.interactive] spank-auks: new unique ccache is KCM:71433:44156
[2025-06-26T14:57:24.467] [34.interactive] spank-auks: user '71433' cred stored in ccache KCM:71433:44156
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: job abstract cores are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: step abstract cores are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: job physical CPUs are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: step physical CPUs are '0'
[2025-06-26T14:57:24.481] [34.interactive] task/cgroup: _memcg_initialize: job: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:24.481] [34.interactive] task/cgroup: _memcg_initialize: step: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:24.501] [34.interactive] warning: restricted to a subset of cpus
[2025-06-26T14:57:24.503] [34.interactive] debug: stdin uses a pty object
[2025-06-26T14:57:24.504] [34.interactive] debug: init pty size 48:324
[2025-06-26T14:57:24.506] [34.interactive] debug levels are stderr='error', logfile='debug', syslog='quiet'
[2025-06-26T14:57:24.507] [34.interactive] debug: IO handler started pid=2666089
[2025-06-26T14:57:24.517] [34.interactive] spank-auks: credential renewer launched (pid=2666107)
[2025-06-26T14:57:24.517] [34.interactive] starting 1 tasks
[2025-06-26T14:57:24.517] [34.interactive] task 0 (2666108) started 2025-06-26T14:57:24
[2025-06-26T14:57:24.537] [34.interactive] debug: task/affinity: task_p_pre_launch: affinity StepId=34.interactive, task:0 bind:mask_cpu,one_thread
[2025-06-26T14:57:24.537] [34.interactive] debug: [job 34] attempting to run slurm task_prolog [/etc/slurm/taskprolog.sh]
[2025-06-26T14:57:24.537] [34.interactive] debug: Sending launch resp rc=0
[2025-06-26T14:57:24.563] [34.interactive] debug: export name:TMPDIR:val:/local/scratch/34:
[2025-06-26T14:59:42.726] [34.interactive] task 0 (2666108) exited with exit code 0. <<<< Here the interactive job was closed
[2025-06-26T14:59:42.727] [34.interactive] spank-auks: all tasks exited, killing credential renewer (pid=2666107)
[2025-06-26T14:59:42.729] [34.interactive] debug: task/affinity: task_p_post_term: affinity StepId=34.interactive, task 0
[2025-06-26T14:59:42.729] [34.interactive] debug: signaling condition
[2025-06-26T14:59:42.729] [34.interactive] debug: jobacct_gather/cgroup: fini: Job accounting gather cgroup plugin unloaded
[2025-06-26T14:59:42.729] [34.interactive] debug: Waiting for IO
[2025-06-26T14:59:42.729] [34.interactive] debug: Closing debug channel
[2025-06-26T14:59:42.729] [34.interactive] debug: IO handler exited, rc=0
[2025-06-26T14:59:42.729] [34.interactive] debug: task/cgroup: fini: Tasks containment cgroup plugin unloaded
[2025-06-26T14:59:42.731] [34.interactive] debug: slurm_recv_timeout at 0 of 4, recv zero bytes
[2025-06-26T14:59:42.738] [34.interactive] spank-auks: Destroyed ccache KCM:71433:44156
[2025-06-26T14:59:42.740] debug: _rpc_terminate_job: uid = 71755 JobId=34 <<<< That uid 71755 is for the slurm account which does not run any daemons on a compute node
[2025-06-26T14:59:42.740] debug: credential for job 34 revoked
[2025-06-26T14:59:42.742] [34.extern] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.742] [34.extern] debug: _handle_signal_container for StepId=34.extern uid=71755 signal=18 flag=0x0
[2025-06-26T14:59:42.742] [34.extern] Sent signal 18 to StepId=34.extern
[2025-06-26T14:59:42.742] [34.interactive] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.742] [34.interactive] debug: _handle_signal_container for StepId=34.interactive uid=71755 signal=18 flag=0x0
[2025-06-26T14:59:42.742] [34.interactive] Sent signal 18 to StepId=34.interactive
[2025-06-26T14:59:42.743] [34.extern] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.743] [34.extern] debug: _handle_signal_container for StepId=34.extern uid=71755 signal=15 flag=0x0
[2025-06-26T14:59:42.743] [34.extern] Sent signal 15 to StepId=34.extern
[2025-06-26T14:59:42.743] [34.interactive] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.743] [34.interactive] debug: _handle_signal_container for StepId=34.interactive uid=71755 signal=15 flag=0x0
[2025-06-26T14:59:42.743] [34.interactive] Sent signal 15 to StepId=34.interactive
[2025-06-26T14:59:42.744] [34.interactive] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.744] [34.interactive] debug: Message thread exited
[2025-06-26T14:59:42.744] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.763] [34.interactive] done with step
[2025-06-26T14:59:42.766] [34.extern] debug: signaling condition
[2025-06-26T14:59:42.766] [34.extern] debug: task/affinity: task_p_post_term: affinity StepId=34.extern, task 0
[2025-06-26T14:59:42.766] [34.extern] debug: jobacct_gather/cgroup: fini: Job accounting gather cgroup plugin unloaded
[2025-06-26T14:59:42.766] [34.extern] debug: task/cgroup: fini: Tasks containment cgroup plugin unloaded
[2025-06-26T14:59:42.766] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.766] [34.extern] debug: Terminate signal (SIGTERM) received
[2025-06-26T14:59:42.768] [34.extern] error: setgroups: Operation not permitted
[2025-06-26T14:59:42.768] [34.extern] error: _shutdown_x11_forward: Unable to drop privileges
[2025-06-26T14:59:42.771] [34.extern] spank-auks: Destroyed ccache KCM:71433:49222
[2025-06-26T14:59:42.817] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.918] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:43.419] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:44.420] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:45.420] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:46.421] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:47.422] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:48.423] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:49.424] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:50.425] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:51.426] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:52.426] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:53.427] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:03.428] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.429] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
[2025-06-26T15:00:12.429] [34.extern] debug: _handle_terminate for StepId=34.extern uid=0
[2025-06-26T15:00:12.429] [34.extern] Sent SIGKILL signal to StepId=34.extern
[2025-06-26T15:00:12.429] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.450] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.501] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.602] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:13.102] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:14.103] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:14.104] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
[2025-06-26T15:00:14.104] [34.extern] debug: _handle_terminate for StepId=34.extern uid=0
[2025-06-26T15:00:14.104] [34.extern] Sent SIGKILL signal to StepId=34.extern
[2025-06-26T15:00:15.105] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:15.105] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
[... the REQUEST_STATE / REQUEST_STEP_TERMINATE / SIGKILL sequence repeats indefinitely ...]
We are left with a "slurmstepd: [34.extern]" process running and a matching socket file for job 34 in /var/spool/slurmd; the job shows state 'CG' (completing). I can clean up by sending kill -9 to the lingering extern slurmstepd on the compute node. I do not know of a way to get any further logging output from that step, but it holds an open file descriptor on /var/log/slurm/slurmd.log (which is what we configured), so I assume it is writing some of the log above.
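For reference, the manual cleanup I do boils down to something like this (job ID 34 and the pgrep pattern are purely illustrative; the pattern matches the process title as it appears in ps):

```shell
#!/bin/sh
# Sketch of the manual cleanup for a stuck extern step. Run as root on
# the compute node. JOBID is an example; substitute the stuck job's ID.
JOBID=34
PATTERN="slurmstepd: \[${JOBID}\.extern\]"   # literal brackets/dot in the title
PID=$(pgrep -f "$PATTERN" || true)
if [ -n "$PID" ]; then
    kill -9 "$PID"    # what I currently do to clear the CG state
fi
```

This is obviously a workaround, not a fix; I would prefer to understand why the step never exits on its own.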
In some runs I also see this just as the main task exits (from the .0 step, not from .extern):
[2025-06-27T21:00:47.265] [39.0] error: common_file_write_content: unable to open '/sys/fs/cgroup/system.slice/slurmstepd.scope/job_39/step_0/user/cgroup.freeze' for writing: Permission denied
I can also see that, between reboots, the directories /sys/fs/cgroup/system.slice/slurmstepd.scope/job_NN remain, each still containing a step_extern folder. The permissions on the files reported as 'permission denied' look fine to me:
# ls -lhZ /sys/fs/cgroup/system.slice/slurmstepd.scope/job_42/step_extern/user/cgroup.freeze
-rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jun 27 21:46 /sys/fs/cgroup/system.slice/slurmstepd.scope/job_42/step_extern/user/cgroup.freeze
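I have been meaning to probe this directly, since the mode bits alone may not be the whole story (SELinux or cgroup v2 delegation could in principle still refuse the write). Something like this, as root, would reproduce the exact write the stepd attempts (path copied from the error above; adjust job/step as needed):

```shell
#!/bin/sh
# Probe: attempt the same cgroup.freeze write the stepd makes, outside Slurm,
# to see whether the refusal is reproducible and at which layer.
F=/sys/fs/cgroup/system.slice/slurmstepd.scope/job_39/step_0/user/cgroup.freeze
if [ -e "$F" ]; then
    # access(2) check first, then the actual write
    if [ -w "$F" ]; then echo "access(2): writable"; else echo "access(2): not writable"; fi
    if echo 1 > "$F" 2>/dev/null; then echo "write succeeded"; else echo "write failed"; fi
fi
```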
There are no SELinux alerts on the system. I am not sure whether the setgroups and _shutdown_x11_forward error messages are the actual problem or just a symptom of something else. The only system I have to compare with runs Slurm 19.05 on CentOS 7 and is configured rather differently.
I would be interested to hear whether anyone else has had problems with extern job steps not shutting down.
William