[slurm-users] Cgroup task plugin fails if ConstrainRAMSpace and ConstrainKmemSpace are enabled

Taras Shapovalov tshapovalov at nvidia.com
Fri Aug 19 09:11:54 UTC 2022


Hey guys,

We noticed that Slurm's memory constraint options (via cgroups) break the cgroup task plugin on CentOS 7 with an upstream kernel <= 4.5. Reproduced with Slurm 21.08.8.

Jobs fail to start:

# srun --mem=1MB hostname
srun: error: task 0 launch failed: Slurmd could not execve job
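
For reference, the running kernel and whether the v1 memory controller exposes the kmem interface can be checked directly on the node (the path follows the CgroupMountpoint from our cgroup.conf below):

# uname -r
# ls /sys/fs/cgroup/memory/ | grep kmem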

In the slurmd log (with debug enabled) we see:

[2022-08-19T10:59:51.802] [6.0] debug2: setup for a launch_task
[2022-08-19T10:59:51.803] [6.0] debug:  jobacct_gather/linux: init: Job accounting gather LINUX plugin loaded
[2022-08-19T10:59:51.804] debug2: Finish processing RPC: REQUEST_LAUNCH_TASKS
[2022-08-19T10:59:51.804] [6.0] debug2: profile signaling type Task
[2022-08-19T10:59:51.805] [6.0] debug:  Message thread started pid = 3386
[2022-08-19T10:59:51.807] [6.0] debug2: hwloc_topology_init
[2022-08-19T10:59:51.807] [6.0] debug2: xcpuinfo_hwloc_topo_load: xml file (/cm/local/apps/slurm/var/spool/hwloc_topo_whole.xml) found
[2022-08-19T10:59:51.808] [6.0] debug:  CPUs:2 Boards:1 Sockets:1 CoresPerSocket:1 ThreadsPerCore:2
[2022-08-19T10:59:51.810] [6.0] debug:  cgroup/v1: init: Cgroup v1 plugin loaded
[2022-08-19T10:59:51.814] [6.0] debug:  task/cgroup: task_cgroup_memory_init: task/cgroup/memory: total:1998M allowed:100%(enforced), swap:0%(permissive), max:100%(1998M) max+swap:100%(3996M) min:25M kmem:100%(1998M enforced) min:25M swappiness:1(set)
[2022-08-19T10:59:51.814] [6.0] debug:  task/cgroup: init: memory enforcement enabled
[2022-08-19T10:59:51.814] [6.0] debug:  task/cgroup: init: Tasks containment cgroup plugin loaded
[2022-08-19T10:59:51.815] [6.0] cred/munge: init: Munge credential signature plugin loaded
[2022-08-19T10:59:51.817] [6.0] debug:  job_container/none: init: job_container none plugin loaded
[2022-08-19T10:59:51.817] [6.0] debug:  mpi type = none
[2022-08-19T10:59:51.819] [6.0] debug2: Before call to spank_init()
[2022-08-19T10:59:51.819] [6.0] debug:  spank: opening plugin stack /cm/shared/apps/slurm/var/etc/slurm/plugstack.conf
[2022-08-19T10:59:51.819] [6.0] debug2: After call to spank_init()
[2022-08-19T10:59:51.819] [6.0] debug:  mpi type = (null)
[2022-08-19T10:59:51.819] [6.0] debug:  mpi/none: p_mpi_hook_slurmstepd_prefork: mpi/none: slurmstepd prefork
[2022-08-19T10:59:51.825] [6.0] task/cgroup: _memcg_initialize: job: alloc=0MB mem.limit=1998MB memsw.limit=unlimited
[2022-08-19T10:59:51.825] [6.0] debug:  task_g_pre_setuid: task/cgroup: Unspecified error
[2022-08-19T10:59:51.825] [6.0] error: Failed to invoke task plugins: one of task_p_pre_setuid functions returned error
[2022-08-19T10:59:51.825] [6.0] debug:  _fork_all_tasks failed
[2022-08-19T10:59:51.826] [6.0] debug:  signaling condition
[2022-08-19T10:59:51.826] [6.0] debug2: step_terminate_monitor will run for 60 secs
[2022-08-19T10:59:51.826] [6.0] debug2: step_terminate_monitor is stopping
[2022-08-19T10:59:51.826] [6.0] debug2: _monitor exit code: 0
[2022-08-19T10:59:51.826] [6.0] debug2: switch/none: switch_p_job_postfini: Sending SIGKILL to pgid 3386
[2022-08-19T10:59:51.841] [6.0] debug:  task/cgroup: fini: Tasks containment cgroup plugin unloaded
[2022-08-19T10:59:51.841] [6.0] debug2: Before call to spank_fini()
[2022-08-19T10:59:51.841] [6.0] debug2: After call to spank_fini()
[2022-08-19T10:59:51.841] [6.0] error: job_manager: exiting abnormally: Slurmd could not execve job
[2022-08-19T10:59:51.841] [6.0] debug:  Sending launch resp rc=4020
[2022-08-19T10:59:51.845] [6.0] debug2: Rank 0 has no children slurmstepd
[2022-08-19T10:59:51.845] [6.0] debug2: _one_step_complete_msg: first=0, last=0
[2022-08-19T10:59:51.851] [6.0] debug2:   false, shutdown
[2022-08-19T10:59:51.851] [6.0] debug:  Message thread exited
[2022-08-19T10:59:51.852] [6.0] done with job
[2022-08-19T10:59:51.865] debug2: Start processing RPC: REQUEST_TERMINATE_JOB
[2022-08-19T10:59:51.865] debug2: Processing RPC: REQUEST_TERMINATE_JOB
[2022-08-19T10:59:51.865] debug:  _rpc_terminate_job: uid = 450 JobId=6
[2022-08-19T10:59:51.865] debug:  credential for job 6 revoked
[2022-08-19T10:59:51.865] debug2: No steps in jobid 6 to send signal 18
[2022-08-19T10:59:51.866] debug2: No steps in jobid 6 to send signal 15
[2022-08-19T10:59:51.866] debug2: set revoke expiration for jobid 6 to 1660899711 UTS
[2022-08-19T10:59:51.866] debug:  Waiting for job 6's prolog to complete
[2022-08-19T10:59:51.866] debug:  Finished wait for job 6's prolog to complete
[2022-08-19T10:59:51.867] debug:  [job 6] attempting to run epilog [/cm/local/apps/cmd/scripts/epilog]
[2022-08-19T10:59:51.880] debug:  completed epilog for jobid 6
[2022-08-19T10:59:51.881] debug:  JobId=6: sent epilog complete msg: rc = 0
[2022-08-19T10:59:51.881] debug2: Finish processing RPC: REQUEST_TERMINATE_JOB
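
To see whether it is the kernel itself refusing the kmem limit rather than Slurm, the job-step setup can be mimicked by hand on the node. This is only a rough sketch of my understanding, assuming cgroup v1 is mounted at /sys/fs/cgroup and using a throwaway cgroup name (kmem_test), with the same 1998M limit the log shows:

# mkdir -p /sys/fs/cgroup/memory/kmem_test/child
# echo $((1998*1024*1024)) > /sys/fs/cgroup/memory/kmem_test/memory.limit_in_bytes
# echo $((1998*1024*1024)) > /sys/fs/cgroup/memory/kmem_test/memory.kmem.limit_in_bytes
# rmdir /sys/fs/cgroup/memory/kmem_test/child /sys/fs/cgroup/memory/kmem_test

On kernels before 4.6 my understanding is that enabling kmem accounting on a cgroup that already has children (or tasks) is rejected by the kernel, which would match the "Unspecified error" from task_g_pre_setuid above, but that is an assumption on my part.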


cgroup.conf:

AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"
CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=no
ConstrainCores=no
ConstrainRAMSpace=yes   <---------------
ConstrainSwapSpace=no
ConstrainDevices=no
ConstrainKmemSpace=yes  <--------------
AllowedRamSpace=100.00
AllowedSwapSpace=0.00
MinKmemSpace=25
MaxKmemPercent=100.00
MemorySwappiness=1
MaxRAMPercent=100.00
MaxSwapPercent=100.00
MinRAMSpace=25

If only one of those two options is enabled, the issue goes away. Updating to a kernel >= 4.6 also fixes the plugin.
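
As a sketch of the workaround, keeping the RAM constraint and dropping the kmem constraint in cgroup.conf (everything else unchanged) avoids the failure on the old kernel; the reverse combination should work as well:

ConstrainRAMSpace=yes
ConstrainKmemSpace=no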

Has anyone faced this issue?

Best regards,

Taras