[slurm-users] Slurmd enabled crash with CgroupV2
Brian Andrus
toomuchit at gmail.com
Fri Mar 10 19:08:27 UTC 2023
I'm not sure which specific item to look at, but this seems like a race
condition.
Likely you need to add an override to your slurmd startup
(/etc/systemd/system/slurmd.service.d/override.conf) and put a dependency
there so slurmd won't start until whatever it depends on has come up.
I have mine wait for a few things:
[Unit]
After=autofs.service getty.target sssd.service
That makes it wait for all of those before trying to start.
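
For what it's worth, a minimal drop-in along those lines might look like the
following (a sketch only; the drop-in path follows the usual systemd
convention, and the units listed are just examples, so wait for whatever your
nodes actually need):

# /etc/systemd/system/slurmd.service.d/override.conf
# create it with "systemctl edit slurmd" (or write the file and run
# "systemctl daemon-reload")
[Unit]
After=autofs.service getty.target sssd.service

"systemctl edit slurmd" creates exactly that file and reloads systemd, so at
boot slurmd is ordered after those units instead of racing them.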
Brian Andrus
On 3/10/2023 7:41 AM, Tristan LEFEBVRE wrote:
>
> Hello to all,
>
> I'm trying to install Slurm with cgroup v2 enabled.
>
> But I'm facing an odd thing: when slurmd is enabled, it crashes at the
> next reboot and never starts again unless I disable it.
>
> Here is a full example of the situation:
>
>
> [root@compute ~]# systemctl start slurmd
> [root@compute ~]# systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; disabled; vendor preset: disabled)
>    Active: active (running) since Fri 2023-03-10 15:57:00 CET; 967ms ago
>  Main PID: 8053 (slurmd)
>     Tasks: 1
>    Memory: 3.1M
>    CGroup: /system.slice/slurmd.service
>            └─8053 /opt/slurm_bin/sbin/slurmd -D --conf-server XXXXX:6817 -s
> mars 10 15:57:00 compute.cluster.lab systemd[1]: Started Slurm node daemon.
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: slurmd version 23.02.0 started
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: slurmd started on Fri, 10 Mar 2023 15:57:00 +0100
> mars 10 15:57:00 compute.cluster.lab slurmd[8053]: slurmd: CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=385311 TmpDisk=19990 Uptime=12>
> [root@compute ~]# systemctl enable slurmd
> Created symlink /etc/systemd/system/multi-user.target.wants/slurmd.service → /usr/lib/systemd/system/slurmd.service.
> [root@compute ~]# reboot now
>
> [ reboot of the node ]
>
> [adm@compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Fri 2023-03-10 16:00:33 CET; 1min 0s ago
>   Process: 2659 ExecStart=/opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
>  Main PID: 2659 (code=exited, status=1/FAILURE)
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: slurmd version 23.02.0 started
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Controller cpuset is not enabled!
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Controller cpu is not enabled!
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: cpu cgroup controller is not available.
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: There's an issue initializing memory or cpu controller
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init()>
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: error: cannot create jobacct_gather context for jobacct_gather/cgroup
> mars 10 16:00:33 compute.cluster.lab slurmd[2659]: slurmd: fatal: Unable to initialize jobacct_gather
> mars 10 16:00:33 compute.cluster.lab systemd[1]: slurmd.service: Main process exited, code=exited, status=1/FAILURE
> mars 10 16:00:33 compute.cluster.lab systemd[1]: slurmd.service: Failed with result 'exit-code'.
> [adm@compute ~]$ sudo systemctl start slurmd
> [adm@compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Fri 2023-03-10 16:01:37 CET; 1s ago
>   Process: 3321 ExecStart=/opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
>  Main PID: 3321 (code=exited, status=1/FAILURE)
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: slurmd version 23.02.0 started
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Controller cpuset is not enabled!
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Controller cpu is not enabled!
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: cpu cgroup controller is not available.
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: There's an issue initializing memory or cpu controller
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: Couldn't load specified plugin name for jobacct_gather/cgroup: Plugin init()>
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: error: cannot create jobacct_gather context for jobacct_gather/cgroup
> mars 10 16:01:37 compute.cluster.lab slurmd[3321]: slurmd: fatal: Unable to initialize jobacct_gather
> mars 10 16:01:37 compute.cluster.lab systemd[1]: slurmd.service: Main process exited, code=exited, status=1/FAILURE
> mars 10 16:01:37 compute.cluster.lab systemd[1]: slurmd.service: Failed with result 'exit-code'.
> [adm@compute ~]$ sudo systemctl disable slurmd
> Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
> [adm@compute ~]$ sudo systemctl start slurmd
> [adm@compute ~]$ sudo systemctl status slurmd
> ● slurmd.service - Slurm node daemon
>    Loaded: loaded (/usr/lib/systemd/system/slurmd.service; disabled; vendor preset: disabled)
>    Active: active (running) since Fri 2023-03-10 16:01:45 CET; 1s ago
>  Main PID: 3358 (slurmd)
>     Tasks: 1
>    Memory: 6.1M
>    CGroup: /system.slice/slurmd.service
>            └─3358 /opt/slurm_bin/sbin/slurmd -D --conf-server XXXX:6817 -s
> mars 10 16:01:45 compute.cluster.lab systemd[1]: Started Slurm node daemon.
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: slurmd version 23.02.0 started
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: slurmd started on Fri, 10 Mar 2023 16:01:45 +0100
> mars 10 16:01:45 compute.cluster.lab slurmd[3358]: slurmd: CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=385311 TmpDisk=19990 Uptime=84>
>
> As you can see, slurmd successfully starts after a reboot only when it is
> not enabled.
>
> - I'm using Rocky Linux 8, and I've configured cgroup v2 with grubby:
>
> grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1 systemd.legacy_systemd_cgroup_controller=0 cgroup_no_v1=all"
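>
> (As a quick sanity check — these are standard cgroup v2 interface files,
> shown purely as an illustration — after the reboot one can confirm that the
> kernel arguments took effect and which controllers are available and
> enabled at the root:)
>
> cat /proc/cmdline
> cat /sys/fs/cgroup/cgroup.controllers
> cat /sys/fs/cgroup/cgroup.subtree_control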
>
> - Slurm 23.02 is built with rpmbuild, and slurmd on the compute node is
> installed with rpm.
>
> - Here is my cgroup.conf:
>
> CgroupPlugin=cgroup/v2
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
> ConstrainDevices=no
>
> And my slurm.conf has:
>
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/cgroup,task/affinity
> JobAcctGatherType=jobacct_gather/cgroup
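>
> (For reference — purely an illustrative check — the values slurmd actually
> picks up can be confirmed with scontrol once the cluster is up:)
>
> scontrol show config | grep -Ei 'proctracktype|taskplugin|jobacctgather'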
>
>
> - If I run "systemctl start slurmd" on a compute node, it succeeds.
>
> - If I run "systemctl enable slurmd" and then "systemctl restart slurmd",
> it is still fine.
>
> - If I enable slurmd and reboot, slurmd reports these errors:
>
> slurmd: error: Controller cpuset is not enabled!
> slurmd: error: Controller cpu is not enabled!
> slurmd: error: cpu cgroup controller is not available.
> slurmd: error: There's an issue initializing memory or cpu controller
>
> - I've done some research and read about cgroup.subtree_control. So if I
> run:
>
> cat /sys/fs/cgroup/cgroup.subtree_control
> memory pids
>
> So I've tried to follow the Red Hat documentation and their example (the
> link to the Red Hat page is here
> <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/using-cgroups-v2-to-control-distribution-of-cpu-time-for-applications_managing-monitoring-and-updating-the-kernel>):
>
> echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
> cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu memory pids
>
> And indeed, I can then restart slurmd.
>
> But at the next boot it fails again, and
> /sys/fs/cgroup/cgroup.subtree_control is back to "memory pids" only.
>
> And, strangely, I found that if slurmd is enabled and I then disable it,
> the value of /sys/fs/cgroup/cgroup.subtree_control changes:
>
> [root@compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
> memory pids
> [root@compute ~]# systemctl disable slurmd
> Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
> [root@compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu io memory pids
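>
> (One way to check whether the slurmd unit asks systemd to delegate cgroup
> controllers — which could explain systemd adding or dropping entries in
> cgroup.subtree_control as the unit is enabled or disabled — is to query the
> unit properties; shown here only as a diagnostic idea:)
>
> systemctl show slurmd -p Delegate -p DelegateControllers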
>
>
> As a dirty fix, I've made a script that runs at launch time via
> ExecStartPre in slurmd.service:
>
> ExecStartPre=/opt/slurm_bin/dirty_fix_slurmd.sh
>
> with dirty_fix_slurmd.sh:
>
> #!/bin/bash
> # Enable the cpu and cpuset controllers on the root cgroup and on system.slice
> echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
> echo "+cpu" >> /sys/fs/cgroup/system.slice/cgroup.subtree_control
> echo "+cpuset" >> /sys/fs/cgroup/system.slice/cgroup.subtree_control
>
> (And I'm not sure whether this is a good thing to do?)
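>
> (As a sketch of one variant only — untested, and assuming the usual
> override path shown below — the same writes could go directly into a
> systemd drop-in, so no separate script is needed; cgroup v2 accepts
> enabling several controllers in a single write:)
>
> # /etc/systemd/system/slurmd.service.d/override.conf (hypothetical)
> [Service]
> ExecStartPre=/bin/sh -c 'echo "+cpu +cpuset" > /sys/fs/cgroup/cgroup.subtree_control'
> ExecStartPre=/bin/sh -c 'echo "+cpu +cpuset" > /sys/fs/cgroup/system.slice/cgroup.subtree_control'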
>
>
> If you have any idea how to correct this situation, please let me know.
>
> Have a nice day
>
> Thank you
>
> Tristan LEFEBVRE
>