[slurm-users] Problems with cgroupsv2

Alan Orth alan.orth at gmail.com
Tue Aug 16 20:36:45 UTC 2022


I re-installed SLURM 22.05.3 and then restarted slurmd and now it's working:

# dnf reinstall slurm slurm-slurmd slurm-devel slurm-pam_slurm
# systemctl restart slurmd

The dnf.log shows that the reinstalled versions were identical, so there was
no version mismatch:

2022-08-16T23:29:02+0300 DEBUG Reinstalled: slurm-22.05.3-1.el8.x86_64
2022-08-16T23:29:02+0300 DEBUG Reinstalled: slurm-devel-22.05.3-1.el8.x86_64
2022-08-16T23:29:02+0300 DEBUG Reinstalled: slurm-pam_slurm-22.05.3-1.el8.x86_64
2022-08-16T23:29:02+0300 DEBUG Reinstalled: slurm-slurmd-22.05.3-1.el8.x86_64
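
A rough sketch for double-checking this from the log itself, assuming the
entry format shown above with one entry per line in /var/log/dnf.log. The
function name is only illustrative:

```shell
# Print the distinct version-release.arch strings of all slurm packages
# that dnf logged as "Reinstalled". More than one output line would hint
# at mixed versions. The default log path is an assumption.
slurm_reinstalled_versions() {
    log="${1:-/var/log/dnf.log}"
    # The last field is name-version-release.arch; the version and the
    # release.arch are its last two hyphen-separated parts.
    awk '/Reinstalled: slurm/ {
             n = split($NF, p, "-")
             print p[n-1] "-" p[n]
         }' "$log" | sort -u
}
```

Running `slurm_reinstalled_versions` with no argument reads the default log;
a single line of output means every reinstalled package had the same version.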

So I'm not sure what was going on... anyway, at least it's working now!
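
For anyone comparing a working node with a failing one: in cgroup v2 a child
cgroup only receives the controllers that its parent lists in
cgroup.subtree_control, so a controller can appear in the root's
cgroup.controllers and still be missing further down. The sketch below lists
controllers that are available in a cgroup directory but not enabled for its
children. The function name is made up for illustration, and whether this is
what slurmd was tripping over here is an assumption, not something the logs
confirm.

```shell
# List controllers that are available in a cgroup v2 directory but are
# not enabled for its child cgroups. The function name and the default
# path are illustrative assumptions, not something Slurm provides.
missing_controllers() {
    cg="${1:-/sys/fs/cgroup}"
    available=$(cat "$cg/cgroup.controllers")
    enabled=$(cat "$cg/cgroup.subtree_control")
    for c in $available; do
        case " $enabled " in
            *" $c "*) ;;                 # enabled for child cgroups
            *) printf '%s\n' "$c" ;;     # available but not passed down
        esac
    done
}
```

For example, `missing_controllers /sys/fs/cgroup/system.slice` would check
the slice that systemd services normally run under; empty output means every
available controller is passed down to the children.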

Regards,

On Tue, Aug 16, 2022 at 12:53 PM Alan Orth <alan.orth at gmail.com> wrote:

> Dear list,
>
> I've been using cgroupsv2 with SLURM 22.05 on CentOS Stream 8 successfully
> for a few months now. Recently a few of my nodes have started having
> problems starting slurmd. The log shows:
>
> [2022-08-16T20:52:58.439] slurmd version 22.05.3 started
> [2022-08-16T20:52:58.439] error: Controller cpuset is not enabled!
> [2022-08-16T20:52:58.439] error: Controller cpu is not enabled!
> [2022-08-16T20:52:58.439] error: cpu cgroup controller is not available.
> [2022-08-16T20:52:58.439] error: There's an issue initializing memory or cpu controller
> [2022-08-16T20:52:58.439] error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init() callback failed
> [2022-08-16T20:52:58.439] error: cannot create jobacct_gather context for jobacct_gather/cgroup
> [2022-08-16T20:52:58.439] fatal: Unable to initialize jobacct_gather
>
> The system has cgroupsv2 enabled as far as I can tell:
>
> # cat /sys/fs/cgroup/cgroup.controllers
> cpuset cpu io memory hugetlb pids rdma
> # [ $(stat -fc %T /sys/fs/cgroup/) = "cgroup2fs" ] && echo "unified" || ( [ -e /sys/fs/cgroup/unified/ ] && echo "hybrid" || echo "legacy")
> unified
>
> And my slurm.conf has:
>
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/affinity,task/cgroup
>
> And cgroup.conf:
>
> CgroupAutomount=yes
> CgroupPlugin=autodetect
>
> What else should I look for before giving up and reverting to cgroupsv1?
> My current version is 22.05.3, but it was happening in 22.05.2 as well.
>
> Thank you for any advice.
> --
> Alan Orth
> alan.orth at gmail.com
> https://picturingjordan.com
> https://englishbulgaria.net
> https://mjanja.ch
>

