extern step does not exit
I am building a new small cluster on Rocky Linux 9 with Slurm 24.11.5. Slurm was compiled with the default options, except that we added --with-pmix.

Our setup has some additional complications: we use NFSv4 with AD authentication (hence auks), and we run SELinux in enforcing mode. However, I currently see no evidence that either relates to the problem we have, though note that the slurm account is an AD account. All relevant AD accounts have the RFC 2307 attributes (uid, gid, etc.) configured. I had also configured pam_slurm_adopt, but as it was causing issues I backed out the changes that enable it; it was too hard to troubleshoot without being able to log in to the test compute node easily.

The issue I am seeing in testing is that whatever job I launch, the extern job step does not terminate when the job does. These are the more relevant slurm.conf settings:

Epilog=/etc/slurm/epilog.sh
JobAcctGatherType=jobacct_gather/cgroup
JobCompType=jobcomp/none
KillOnBadExit=1
KillWait=30
ProctrackType=proctrack/cgroup
Prolog=/etc/slurm/prolog.sh
PrologFlags=Alloc,Contain
SelectTypeParameters=CR_Core_Memory
SelectType=select/cons_tres
SlurmUser=slurm
TaskPlugin=task/affinity,task/cgroup
TaskProlog=/etc/slurm/taskprolog.sh

The slurmd.log on the compute node shows a sequence like this for the job (with -vv debug enabled):

[2025-06-26T14:57:23.272] select/cons_tres: init: select/cons_tres loaded
[2025-06-26T14:57:23.272] select/linear: init: Linear node selection plugin loaded with argument 20
[2025-06-26T14:57:23.272] cred/munge: init: Munge credential signature plugin loaded
[2025-06-26T14:57:23.272] [34.extern] debug: auth/munge: init: loaded
[2025-06-26T14:57:23.273] [34.extern] debug: Reading cgroup.conf file /etc/slurm/cgroup.conf
[2025-06-26T14:57:23.279] [34.extern] debug: cgroup/v2: init: Cgroup v2 plugin loaded
[2025-06-26T14:57:23.290] [34.extern] debug: hash/k12: init: init: KangarooTwelve hash plugin loaded
[2025-06-26T14:57:23.291] [34.extern] debug: CPUs:40 Boards:1 Sockets:2 CoresPerSocket:20 ThreadsPerCore:1
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: core enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: task_cgroup_memory_init: task/cgroup/memory: TotCfgRealMem:191587M allowed:100%(enforced), swap:0%(enforced), max:100%(191587M) max+swap:100%(383174M) min:30M
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: memory enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: device enforcement enabled
[2025-06-26T14:57:23.291] [34.extern] debug: task/cgroup: init: Tasks containment cgroup plugin loaded
[2025-06-26T14:57:23.291] [34.extern] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffff
[2025-06-26T14:57:23.291] [34.extern] debug: jobacct_gather/cgroup: init: Job accounting gather cgroup plugin loaded
[2025-06-26T14:57:23.291] [34.extern] topology/default: init: topology Default plugin loaded
[2025-06-26T14:57:23.292] [34.extern] debug: gpu/generic: init: init: GPU Generic plugin loaded
[2025-06-26T14:57:23.293] [34.extern] debug: Setting slurmstepd(2666071) oom_score_adj to -1000
[2025-06-26T14:57:23.293] [34.extern] debug: Message thread started pid = 2666071
[2025-06-26T14:57:23.294] [34.extern] debug: spank: opening plugin stack /etc/slurm/plugstack.conf
[2025-06-26T14:57:23.294] [34.extern] debug: /etc/slurm/plugstack.conf: 1: include "/etc/slurm/plugstack.conf.d/*.conf"
[2025-06-26T14:57:23.294] [34.extern] debug: spank: opening plugin stack /etc/slurm/plugstack.conf.d/auks.conf
[2025-06-26T14:57:23.298] [34.extern] debug: spank: /etc/slurm/plugstack.conf.d/auks.conf:57: Loaded plugin auks.so
[2025-06-26T14:57:23.298] [34.extern] debug: SPANK: appending plugin option "auks"
[2025-06-26T14:57:23.352] [34.extern] spank-auks: new unique ccache is KCM:71433:49222   <<<< This UID 71433 is the user running the job
[2025-06-26T14:57:23.364] [34.extern] spank-auks: user '71433' cred stored in ccache KCM:71433:49222
[2025-06-26T14:57:23.380] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: job abstract cores are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: step abstract cores are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: job physical CPUs are '0'
[2025-06-26T14:57:23.381] [34.extern] debug: task/cgroup: task_cgroup_cpuset_create: step physical CPUs are '0'
[2025-06-26T14:57:23.381] [34.extern] task/cgroup: _memcg_initialize: job: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:23.381] [34.extern] task/cgroup: _memcg_initialize: step: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:23.385] [34.extern] debug: close_slurmd_conn: sending 0: No error
[2025-06-26T14:57:24.371] launch task StepId=34.interactive request from UID:71433 GID:70668 HOST:172.17.11.22 PORT:46948
[2025-06-26T14:57:24.372] task/affinity: lllp_distribution: JobId=34 manual binding: mask_cpu,one_thread
[2025-06-26T14:57:24.372] debug: Waiting for job 34's prolog to complete
[2025-06-26T14:57:24.372] debug: Finished wait for job 34's prolog to complete
[2025-06-26T14:57:24.377] select/cons_tres: init: select/cons_tres loaded
[2025-06-26T14:57:24.377] select/linear: init: Linear node selection plugin loaded with argument 20
[2025-06-26T14:57:24.377] cred/munge: init: Munge credential signature plugin loaded
[2025-06-26T14:57:24.377] [34.interactive] debug: auth/munge: init: loaded
[2025-06-26T14:57:24.379] [34.interactive] debug: Reading cgroup.conf file /etc/slurm/cgroup.conf
[2025-06-26T14:57:24.384] [34.interactive] debug: cgroup/v2: init: Cgroup v2 plugin loaded
[2025-06-26T14:57:24.403] [34.interactive] debug: hash/k12: init: init: KangarooTwelve hash plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] debug: CPUs:40 Boards:1 Sockets:2 CoresPerSocket:20 ThreadsPerCore:1
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: core enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: task_cgroup_memory_init: task/cgroup/memory: TotCfgRealMem:191587M allowed:100%(enforced), swap:0%(enforced), max:100%(191587M) max+swap:100%(383174M) min:30M
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: memory enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: device enforcement enabled
[2025-06-26T14:57:24.404] [34.interactive] debug: task/cgroup: init: Tasks containment cgroup plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffffff
[2025-06-26T14:57:24.404] [34.interactive] debug: jobacct_gather/cgroup: init: Job accounting gather cgroup plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] topology/default: init: topology Default plugin loaded
[2025-06-26T14:57:24.404] [34.interactive] debug: gpu/generic: init: init: GPU Generic plugin loaded
[2025-06-26T14:57:24.406] [34.interactive] debug: close_slurmd_conn: sending 0: No error
[2025-06-26T14:57:24.406] [34.interactive] debug: Message thread started pid = 2666089
[2025-06-26T14:57:24.406] [34.interactive] debug: Setting slurmstepd(2666089) oom_score_adj to -1000
[2025-06-26T14:57:24.407] [34.interactive] debug: spank: opening plugin stack /etc/slurm/plugstack.conf
[2025-06-26T14:57:24.407] [34.interactive] debug: /etc/slurm/plugstack.conf: 1: include "/etc/slurm/plugstack.conf.d/*.conf"
[2025-06-26T14:57:24.407] [34.interactive] debug: spank: opening plugin stack /etc/slurm/plugstack.conf.d/auks.conf
[2025-06-26T14:57:24.411] [34.interactive] debug: spank: /etc/slurm/plugstack.conf.d/auks.conf:57: Loaded plugin auks.so
[2025-06-26T14:57:24.411] [34.interactive] debug: SPANK: appending plugin option "auks"
[2025-06-26T14:57:24.461] [34.interactive] spank-auks: new unique ccache is KCM:71433:44156
[2025-06-26T14:57:24.467] [34.interactive] spank-auks: user '71433' cred stored in ccache KCM:71433:44156
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: job abstract cores are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: step abstract cores are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: job physical CPUs are '0'
[2025-06-26T14:57:24.480] [34.interactive] debug: task/cgroup: task_cgroup_cpuset_create: step physical CPUs are '0'
[2025-06-26T14:57:24.481] [34.interactive] task/cgroup: _memcg_initialize: job: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:24.481] [34.interactive] task/cgroup: _memcg_initialize: step: alloc=1024MB mem.limit=1024MB memsw.limit=1024MB job_swappiness=18446744073709551614
[2025-06-26T14:57:24.501] [34.interactive] warning: restricted to a subset of cpus
[2025-06-26T14:57:24.503] [34.interactive] debug: stdin uses a pty object
[2025-06-26T14:57:24.504] [34.interactive] debug: init pty size 48:324
[2025-06-26T14:57:24.506] [34.interactive] debug levels are stderr='error', logfile='debug', syslog='quiet'
[2025-06-26T14:57:24.507] [34.interactive] debug: IO handler started pid=2666089
[2025-06-26T14:57:24.517] [34.interactive] spank-auks: credential renewer launched (pid=2666107)
[2025-06-26T14:57:24.517] [34.interactive] starting 1 tasks
[2025-06-26T14:57:24.517] [34.interactive] task 0 (2666108) started 2025-06-26T14:57:24
[2025-06-26T14:57:24.537] [34.interactive] debug: task/affinity: task_p_pre_launch: affinity StepId=34.interactive, task:0 bind:mask_cpu,one_thread
[2025-06-26T14:57:24.537] [34.interactive] debug: [job 34] attempting to run slurm task_prolog [/etc/slurm/taskprolog.sh]
[2025-06-26T14:57:24.537] [34.interactive] debug: Sending launch resp rc=0
[2025-06-26T14:57:24.563] [34.interactive] debug: export name:TMPDIR:val:/local/scratch/34:
[2025-06-26T14:59:42.726] [34.interactive] task 0 (2666108) exited with exit code 0.   <<<< Here the interactive job was closed
[2025-06-26T14:59:42.727] [34.interactive] spank-auks: all tasks exited, killing credential renewer (pid=2666107)
[2025-06-26T14:59:42.729] [34.interactive] debug: task/affinity: task_p_post_term: affinity StepId=34.interactive, task 0
[2025-06-26T14:59:42.729] [34.interactive] debug: signaling condition
[2025-06-26T14:59:42.729] [34.interactive] debug: jobacct_gather/cgroup: fini: Job accounting gather cgroup plugin unloaded
[2025-06-26T14:59:42.729] [34.interactive] debug: Waiting for IO
[2025-06-26T14:59:42.729] [34.interactive] debug: Closing debug channel
[2025-06-26T14:59:42.729] [34.interactive] debug: IO handler exited, rc=0
[2025-06-26T14:59:42.729] [34.interactive] debug: task/cgroup: fini: Tasks containment cgroup plugin unloaded
[2025-06-26T14:59:42.731] [34.interactive] debug: slurm_recv_timeout at 0 of 4, recv zero bytes
[2025-06-26T14:59:42.738] [34.interactive] spank-auks: Destroyed ccache KCM:71433:44156
[2025-06-26T14:59:42.740] debug: _rpc_terminate_job: uid = 71755 JobId=34   <<<< That uid 71755 is for the slurm account, which does not run any daemons on a compute node
[2025-06-26T14:59:42.740] debug: credential for job 34 revoked
[2025-06-26T14:59:42.742] [34.extern] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.742] [34.extern] debug: _handle_signal_container for StepId=34.extern uid=71755 signal=18 flag=0x0
[2025-06-26T14:59:42.742] [34.extern] Sent signal 18 to StepId=34.extern
[2025-06-26T14:59:42.742] [34.interactive] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.742] [34.interactive] debug: _handle_signal_container for StepId=34.interactive uid=71755 signal=18 flag=0x0
[2025-06-26T14:59:42.742] [34.interactive] Sent signal 18 to StepId=34.interactive
[2025-06-26T14:59:42.743] [34.extern] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.743] [34.extern] debug: _handle_signal_container for StepId=34.extern uid=71755 signal=15 flag=0x0
[2025-06-26T14:59:42.743] [34.extern] Sent signal 15 to StepId=34.extern
[2025-06-26T14:59:42.743] [34.interactive] debug: Handling REQUEST_SIGNAL_CONTAINER
[2025-06-26T14:59:42.743] [34.interactive] debug: _handle_signal_container for StepId=34.interactive uid=71755 signal=15 flag=0x0
[2025-06-26T14:59:42.743] [34.interactive] Sent signal 15 to StepId=34.interactive
[2025-06-26T14:59:42.744] [34.interactive] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.744] [34.interactive] debug: Message thread exited
[2025-06-26T14:59:42.744] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.763] [34.interactive] done with step
[2025-06-26T14:59:42.766] [34.extern] debug: signaling condition
[2025-06-26T14:59:42.766] [34.extern] debug: task/affinity: task_p_post_term: affinity StepId=34.extern, task 0
[2025-06-26T14:59:42.766] [34.extern] debug: jobacct_gather/cgroup: fini: Job accounting gather cgroup plugin unloaded
[2025-06-26T14:59:42.766] [34.extern] debug: task/cgroup: fini: Tasks containment cgroup plugin unloaded
[2025-06-26T14:59:42.766] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.766] [34.extern] debug: Terminate signal (SIGTERM) received
[2025-06-26T14:59:42.768] [34.extern] error: setgroups: Operation not permitted
[2025-06-26T14:59:42.768] [34.extern] error: _shutdown_x11_forward: Unable to drop privileges
[2025-06-26T14:59:42.771] [34.extern] spank-auks: Destroyed ccache KCM:71433:49222
[2025-06-26T14:59:42.817] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:42.918] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:43.419] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:44.420] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:45.420] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:46.421] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:47.422] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:48.423] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:49.424] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:50.425] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:51.426] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:52.426] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T14:59:53.427] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:03.428] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.429] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
[2025-06-26T15:00:12.429] [34.extern] debug: _handle_terminate for StepId=34.extern uid=0
[2025-06-26T15:00:12.429] [34.extern] Sent SIGKILL signal to StepId=34.extern
[2025-06-26T15:00:12.429] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.450] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.501] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:12.602] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:13.102] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:14.103] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:14.104] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
[2025-06-26T15:00:14.104] [34.extern] debug: _handle_terminate for StepId=34.extern uid=0
[2025-06-26T15:00:14.104] [34.extern] Sent SIGKILL signal to StepId=34.extern
[2025-06-26T15:00:15.105] [34.extern] debug: Handling REQUEST_STATE
[2025-06-26T15:00:15.105] [34.extern] debug: Handling REQUEST_STEP_TERMINATE
...

We are left with a process "slurmstepd: [34.extern]" running, and a matching socket file for job 34 in /var/spool/slurmd; the job shows state 'CG' (Completing). I can clean up by sending kill -9 to the 'extern' slurmstepd on the compute node.
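To make it easy to spot the symptom, this is a small sketch of the check I run against the log: a stuck extern step shows up as repeated SIGKILL attempts against the same StepId. The sample lines below are taken from the excerpt above; on a live node you would grep /var/log/slurm/slurmd.log instead of a here-string.

```shell
# Sample lines from the slurmd.log excerpt above (on a live node, read
# /var/log/slurm/slurmd.log instead).
log_excerpt='[2025-06-26T15:00:12.429] [34.extern] Sent SIGKILL signal to StepId=34.extern
[2025-06-26T15:00:14.104] [34.extern] Sent SIGKILL signal to StepId=34.extern'

# More than one SIGKILL to the same step means slurmd is retrying a kill
# that never takes effect, i.e. the stepd is not exiting.
kills=$(printf '%s\n' "$log_excerpt" | grep -c 'Sent SIGKILL signal to StepId=34\.extern')
echo "SIGKILL attempts: $kills"   # prints "SIGKILL attempts: 2"
```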
I do not know a way to get any logging output from that job step, but it has a file descriptor open to /var/log/slurm/slurmd.log (which is what we configured), so I assume it is writing some of the log above.

In some runs I also see this just as the main task exits (from the .0 job step, not from .extern):

[2025-06-27T21:00:47.265] [39.0] error: common_file_write_content: unable to open '/sys/fs/cgroup/system.slice/slurmstepd.scope/job_39/step_0/user/cgroup.freeze' for writing: Permission denied

I can also see that, between reboots, the directories /sys/fs/cgroup/system.slice/slurmstepd.scope/job_NN remain, each with its job_NN/step_extern folder. The permissions of the files referred to in the 'Permission denied' errors look fine to me:

# ls -lhZ /sys/fs/cgroup/system.slice/slurmstepd.scope/job_42/step_extern/user/cgroup.freeze
-rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jun 27 21:46 /sys/fs/cgroup/system.slice/slurmstepd.scope/job_42/step_extern/user/cgroup.freeze

There are no SELinux alerts on the system. I am not sure whether the error messages about setgroups and _shutdown_x11_forward are the actual problem, or just something else being reported along the way. The only system I have to compare with runs Slurm 19.05 on CentOS 7 and is rather different.

I would be interested to know if anyone else has had problems with extern job steps not shutting down.

William
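PS: for anyone wanting to reproduce the "leftover directories" check above, this is a sketch of the enumeration. It builds a throwaway mock of the job_NN layout under a temp directory so the command can be shown end-to-end; on a live node the root would be /sys/fs/cgroup/system.slice/slurmstepd.scope and you would expect the list to be empty once jobs have completed.

```shell
# Mock of the cgroup layout described above (a live node would use
# /sys/fs/cgroup/system.slice/slurmstepd.scope as $root instead).
root=$(mktemp -d)
mkdir -p "$root/job_34/step_extern/user" "$root/job_42/step_extern/user"

# Any job_NN directory still present after its job has completed is
# leftover state; each stuck job here leaves exactly one such directory.
find "$root" -maxdepth 1 -type d -name 'job_*' | sort
```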
participants (1)
- william@signalbox.org.uk