[slurm-users] Problem with cgroup plugin in Ubuntu22.04 and slurm 21.08.5
Angel de Vicente
angel.de.vicente at iac.es
Thu Sep 7 18:09:23 UTC 2023
Cristóbal Navarro <cristobal.navarro.g at gmail.com> writes:
> Hello Angel and Community,
> I am facing a similar problem with a DGX A100 with DGX OS 6 (Based on
> Ubuntu 22.04 LTS) and Slurm 23.02.
> When I start the `slurmd` service, its status shows failed with the
> information below.
> As of today, what is the best solution to this problem? I am really
> not sure if the DGX A100 could fail or not by disabling cgroups v1.
> Any suggestions are welcome.
Did you manage to find a solution to this without disabling cgroups v1?
In our case:
| slurm 23.02.3
| Ubuntu 22.04.3 LTS
| # cat /proc/cmdline
| BOOT_IMAGE=/boot/vmlinuz-5.15.0-83-generic root=UUID=... ro quiet splash cgroup_no_v1=all vt.handoff=7
disabling cgroups v1 has been working reliably, but it would be nice to
find a solution that doesn't require modifying the kernel parameters.
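
For anyone wanting to check what state a node is actually in, here is a
minimal sketch, not a definitive test. The function names are made up for
illustration, and it assumes the usual systemd layout in which a pure
cgroup v2 host reports `cgroup2fs` for /sys/fs/cgroup while a v1 or hybrid
host reports `tmpfs`.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Slurm).

# Return 0 if the given kernel command line contains cgroup_no_v1=all.
has_cgroup_no_v1_all() {
    for arg in $1; do
        [ "$arg" = "cgroup_no_v1=all" ] && return 0
    done
    return 1
}

# Print which hierarchy appears to be mounted at /sys/fs/cgroup.
# Assumption: cgroup2fs means unified v2, tmpfs means v1 or hybrid.
cgroup_mode() {
    case "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1-or-hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# Example use on the running node.
echo "cgroup mode: $(cgroup_mode)"
if has_cgroup_no_v1_all "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "cgroup v1 controllers are disabled on the kernel command line"
else
    echo "cgroup_no_v1=all is not set"
fi
```

On a node booted with the command line shown above, the second check
should report that the v1 controllers are disabled.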
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747