[slurm-users] Slurmd enabled crash with CgroupV2

Alan Orth alan.orth at gmail.com
Tue May 23 10:46:21 UTC 2023


I notice the exact same behavior as Tristan. My CentOS Stream 8 system is
in full unified cgroup v2 mode, slurmd.service has a "Delegate=Yes"
override added to it, and the cgroup settings are present in slurm.conf and
cgroup.conf, yet slurmd does not start after a reboot. I don't understand
what is happening, but I see the same effect on cgroup.subtree_control
when disabling and re-enabling slurmd.

[root at compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
memory pids
[root at compute ~]# systemctl disable slurmd
Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
[root at compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
cpuset cpu io memory pids
[root at compute ~]# systemctl enable slurmd
Created symlink /etc/systemd/system/multi-user.target.wants/slurmd.service
→ /usr/lib/systemd/system/slurmd.service.
[root at compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
cpuset cpu io memory pids

After this, slurmd starts up successfully (until the next reboot). We are
on Slurm 22.05.9.
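
For reference, the Delegate override I mentioned is just a systemd drop-in.
A minimal sketch, assuming the standard drop-in path that "systemctl edit
slurmd" creates:

# /etc/systemd/system/slurmd.service.d/override.conf
[Service]
Delegate=yes

If the file is written by hand rather than via "systemctl edit", a
"systemctl daemon-reload" is needed before restarting slurmd.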

Regards,



On Fri, Mar 10, 2023 at 10:10 PM Brian Andrus <toomuchit at gmail.com> wrote:

> I'm not sure which specific item to look at, but this seems like a race
> condition.
> Likely you need to add an override to your slurmd startup
> (/etc/systemd/system/slurmd.service.d/override.conf) and put a dependency
> there so it won't start until the units it needs have finished starting.
>
> I have mine wait for a few things:
>
> [Unit]
> After=autofs.service getty.target sssd.service
>
>
> That makes it wait for all of those before trying to start.
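>
> The usual way to add such an override (a sketch, assuming the stock unit
> name) is to let systemd create the drop-in:
>
> systemctl edit slurmd
> # opens /etc/systemd/system/slurmd.service.d/override.conf,
> # where the [Unit] After=... lines above can be pasted
>
> systemd then merges the drop-in with the packaged unit file.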
>
> Brian Andrus
> On 3/10/2023 7:41 AM, Tristan LEFEBVRE wrote:
>
> Hello to all,
>
> I'm trying to install Slurm with cgroup v2 enabled.
>
> But I'm facing an odd issue: when slurmd is enabled, it crashes at the
> next reboot and will not start again unless I disable it.
>
> Here is a full example of the situation:
>
>
> [root at compute ~]# systemctl start slurmd
> [root at compute ~]# systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; disabled; vendor preset: disabled)
>    Active: active (running) since Fri 2023-03-10 15:57:00 CET; 967ms ago
>  Main PID: 8053 (slurmd)
>     Tasks: 1
>    Memory: 3.1M
>    CGroup: /system.slice/slurmd.service
>            └─8053 /opt/slurm_bin/sbin/slurmd -D --conf-server XXXXX:6817 -s
>
> mars 10 15:57:00 compute.cluster.lab systemd[1]: Started Slurm node daemon.
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: slurmd version 23.02.0 started
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: slurmd started on Fri, 10 Mar 2023 15:57:00 +0100
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=385311 TmpDisk=19990 Uptime=12>
>
> [root at compute ~]# systemctl enable slurmd
> Created symlink /etc/systemd/system/multi-user.target.wants/slurmd.service → /usr/lib/systemd/system/slurmd.service.
> [root at compute ~]# reboot now
>
> [ reboot of the node ]
>
> [adm at compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Fri 2023-03-10 16:00:33 CET; 1min 0s ago
>   Process: 2659 ExecStart=/opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
>  Main PID: 2659 (code=exited, status=1/FAILURE)
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: slurmd version 23.02.0 started
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Controller cpuset is not enabled!
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Controller cpu is not enabled!
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: cpu cgroup controller is not available.
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: There's an issue initializing memory or cpu controller
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init()>
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: cannot create jobacct_gather context for jobacct_gather/cgroup
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: fatal: Unable to initialize jobacct_gather
> mars 10 16:00:33 compute.cluster.lab systemd[1]: slurmd.service: Main process exited, code=exited, status=1/FAILURE
> mars 10 16:00:33 compute.cluster.lab systemd[1]: slurmd.service: Failed with result 'exit-code'.
>
> [adm at compute ~]$ sudo systemctl start slurmd
> [adm at compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Fri 2023-03-10 16:01:37 CET; 1s ago
>   Process: 3321 ExecStart=/opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
>  Main PID: 3321 (code=exited, status=1/FAILURE)
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: slurmd version 23.02.0 started
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Controller cpuset is not enabled!
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Controller cpu is not enabled!
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: cpu cgroup controller is not available.
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: There's an issue initializing memory or cpu controller
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init()>
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: cannot create jobacct_gather context for jobacct_gather/cgroup
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: fatal: Unable to initialize jobacct_gather
> mars 10 16:01:37 compute.cluster.lab systemd[1]: slurmd.service: Main process exited, code=exited, status=1/FAILURE
> mars 10 16:01:37 compute.cluster.lab systemd[1]: slurmd.service: Failed with result 'exit-code'.
>
> [adm at compute ~]$ sudo systemctl disable slurmd
> Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
> [adm at compute ~]$ sudo systemctl start slurmd
> [adm at compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; disabled; vendor preset: disabled)
>    Active: active (running) since Fri 2023-03-10 16:01:45 CET; 1s ago
>  Main PID: 3358 (slurmd)
>     Tasks: 1
>    Memory: 6.1M
>    CGroup: /system.slice/slurmd.service
>            └─3358 /opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s
> mars 10 16:01:45 compute.cluster.lab systemd[1]: Started Slurm node daemon.
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: slurmd version 23.02.0 started
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: slurmd started on Fri, 10 Mar 2023 16:01:45 +0100
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=385311 TmpDisk=19990 Uptime=84>
>
> As you can see, slurmd starts successfully after a reboot only when it is
> not enabled.
>
> - I'm using Rocky Linux 8, and I've configured cgroup v2 with grubby:
>
> > grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1 systemd.legacy_systemd_cgroup_controller=0 cgroup_no_v1=all"
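>
> A quick way to double-check that the node really booted in unified mode
> (a sketch, not part of the original commands):
>
> stat -fc %T /sys/fs/cgroup              # prints "cgroup2fs" on a pure cgroup v2 host
> cat /sys/fs/cgroup/cgroup.controllers   # controllers the root cgroup can delegate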
>
> - Slurm 23.02 is built with rpmbuild, and slurmd on the compute node is
> installed from the RPM.
>
> - Here is my cgroup.conf :
>
> CgroupPlugin=cgroup/v2
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
> ConstrainDevices=no
>
> And my slurm.conf has:
>
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/cgroup,task/affinity
> JobAcctGatherType=jobacct_gather/cgroup
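>
> A quick way to confirm what the node actually picked up (a sketch,
> assuming scontrol is available on the compute node):
>
> scontrol show config | grep -iE 'cgroup|proctrack|taskplugin|jobacctgather'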
>
>
> - If I do "systemctl start slurmd" on a compute node, it succeeds.
>
> - If I do "systemctl enable slurmd" and then "systemctl restart slurmd",
> it's still OK.
>
> - If I enable slurmd and reboot, slurmd reports these errors:
>
> slurmd: error: Controller cpuset is not enabled!
> slurmd: error: Controller cpu is not enabled!
> slurmd: error: cpu cgroup controller is not available.
> slurmd: error: There's an issue initializing memory or cpu controller
>
> - I've done some research and read about cgroup.subtree_control. If I do:
>
> cat /sys/fs/cgroup/cgroup.subtree_control
> memory pids
>
> So I tried to follow the Red Hat documentation and its example (the
> Red Hat page is here
> <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/using-cgroups-v2-to-control-distribution-of-cpu-time-for-applications_managing-monitoring-and-updating-the-kernel>
>
> echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
> cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu memory pids
>
> And indeed, I can then restart slurmd.
>
> But at the next boot it fails again, and
> /sys/fs/cgroup/cgroup.subtree_control is back to "memory pids" only.
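>
> A way to see at which level the controllers get dropped (a sketch, not
> part of the original checks) is to compare what each level advertises
> with what it delegates downward:
>
> for d in /sys/fs/cgroup /sys/fs/cgroup/system.slice; do
>     echo "== $d"
>     echo -n "available: "; cat "$d/cgroup.controllers"
>     echo -n "delegated: "; cat "$d/cgroup.subtree_control"
> done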
>
> And strangely, I found that if slurmd is enabled and I then disable it,
> the value of /sys/fs/cgroup/cgroup.subtree_control changes:
>
> [root at compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
> memory pids
> [root at compute ~]# systemctl disable slurmd
> Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
> [root at compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu io memory pids
>
>
> As a dirty fix, I've made a script that runs at launch time via
> ExecStartPre in slurmd.service:
>
> ExecStartPre=/opt/slurm_bin/dirty_fix_slurmd.sh
>
> with dirty_fix_slurmd.sh:
>
> #!/bin/bash
> echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpu" >> /sys/fs/cgroup/system.slice/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/system.slice/cgroup.subtree_control
>
>
> (And I'm not sure whether this is a good thing to do?)
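>
> For what it's worth, the same workaround can also be expressed directly
> in a drop-in, without a separate script (a sketch, assuming the override
> path that "systemctl edit slurmd" creates):
>
> # /etc/systemd/system/slurmd.service.d/override.conf
> [Service]
> ExecStartPre=/bin/sh -c 'echo "+cpu +cpuset" > /sys/fs/cgroup/cgroup.subtree_control'
> ExecStartPre=/bin/sh -c 'echo "+cpu +cpuset" > /sys/fs/cgroup/system.slice/cgroup.subtree_control'
>
> This only mirrors what the script above does; whether enabling the
> controllers by hand like this is the right long-term fix is a separate
> question.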
>
>
> If you have an idea of how to correct this situation, I would appreciate it.
>
> Have a nice day
>
> Thank you
>
> Tristan LEFEBVRE
>
>

-- 
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch