[slurm-users] Problems with cgroupsv2

Alan Orth alan.orth at gmail.com
Tue Aug 16 19:53:43 UTC 2022


Dear list,

I've been using cgroupsv2 with Slurm 22.05 on CentOS Stream 8 successfully
for a few months now. Recently slurmd has begun failing to start on a few
of my nodes. The log shows:

[2022-08-16T20:52:58.439] slurmd version 22.05.3 started
[2022-08-16T20:52:58.439] error: Controller cpuset is not enabled!
[2022-08-16T20:52:58.439] error: Controller cpu is not enabled!
[2022-08-16T20:52:58.439] error: cpu cgroup controller is not available.
[2022-08-16T20:52:58.439] error: There's an issue initializing memory or cpu controller
[2022-08-16T20:52:58.439] error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init() callback failed
[2022-08-16T20:52:58.439] error: cannot create jobacct_gather context for jobacct_gather/cgroup
[2022-08-16T20:52:58.439] fatal: Unable to initialize jobacct_gather

The system has cgroupsv2 enabled as far as I can tell:

# cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma
# [ $(stat -fc %T /sys/fs/cgroup/) = "cgroup2fs" ] && echo "unified" || ( [ -e /sys/fs/cgroup/unified/ ] && echo "hybrid" || echo "legacy")
unified
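
The root cgroup.controllers file only lists what the kernel makes available
at the top of the hierarchy. In cgroup v2 each child cgroup has its own
cgroup.controllers file, which contains only the controllers its parent has
enabled through cgroup.subtree_control, so a controller can be present at the
root and still be missing further down. A minimal sketch of a check for that
follows. The helper name missing_controllers is hypothetical, and the
assumption that the cgroup where slurmd runs is the relevant place to look is
mine, not something the log confirms:

```shell
#!/bin/sh
# Hypothetical helper: print each wanted controller that is NOT listed in
# the cgroup.controllers file of the given cgroup directory.
# Usage: missing_controllers <cgroup-dir> <controller>...
missing_controllers() {
    cgdir=$1
    shift
    file="$cgdir/cgroup.controllers"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # Pad with spaces so that "cpu" does not match inside "cpuset".
    enabled=" $(tr '\n' ' ' < "$file") "
    for c in "$@"; do
        case "$enabled" in
            *" $c "*) ;;          # controller is available here
            *) echo "$c" ;;       # controller is missing here
        esac
    done
}
```

On a node where slurmd is started by systemd, it could be run against the
slice directory, for example
missing_controllers /sys/fs/cgroup/system.slice cpuset cpu memory
(the system.slice path is an assumption about the unit's placement). Empty
output means all three controllers are visible at that level.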

And my slurm.conf has:

ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup

And cgroup.conf:

CgroupAutomount=yes
CgroupPlugin=autodetect

What else should I look for before giving up and reverting to cgroupsv1? My
current version is 22.05.3, but it was happening in 22.05.2 as well.

Thank you for any advice.
-- 
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch