[slurm-users] task/cgroup plugin causes "srun: error: task 0 launch failed: Plugin initialization failed" error on Ubuntu 22.04

Reed Dier reed.dier at focusvq.com
Fri Jun 16 00:12:04 UTC 2023


I don’t have a direct answer off-hand, but I can at least try to steer the conversation in a useful direction.

Since you mention 21.08.5, I’m going to assume you are using the slurm-wlm packages from the Ubuntu repos rather than building Slurm yourself?

And have all the components (slurmctld(s), slurmdbd, slurmd(s)) been upgraded as well?
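
One way to answer that is to print the version each daemon reports on every host and compare them. This is only a rough sketch: it assumes the daemons are on the PATH, and versions_match is a throwaway helper name of mine, not a Slurm tool.

```shell
# versions_match is a hypothetical helper (not part of Slurm):
# prints "match" if all arguments are identical, otherwise "mismatch".
versions_match() {
    first="$1"
    for v in "$@"; do
        if [ "$v" != "$first" ]; then
            echo "mismatch"
            return 0
        fi
    done
    echo "match"
}

# On each host, print the version of whichever Slurm daemons are installed.
# -V asks the daemon to print its version and exit.
for bin in slurmctld slurmdbd slurmd; do
    if command -v "$bin" >/dev/null 2>&1; then
        echo "$bin: $("$bin" -V)"
    fi
done
```

Once you have collected the strings from the controller and the compute nodes, something like versions_match "$ctld_version" "$slurmd_version" (variable names made up here) tells you whether anything was left behind.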

The only thing that immediately comes to mind is that Ubuntu 22.04 uses cgroups v2 by default, which as I understand it is very different from cgroups v1, and plenty of people have run into v1/v2 mismatches with Slurm and other applications.

https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/
https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1
https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022
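
If it helps, here is a quick way to check which cgroup version a node is actually running. Treat it as a sketch under a few assumptions: GNU coreutils stat is available, cgroups are mounted at the conventional /sys/fs/cgroup, and cgroup_mode_from_fstype is just a helper name I made up. My understanding is that native cgroup v2 support only arrived in Slurm 22.05, so 21.08.5 on a pure-v2 host would be worth ruling out.

```shell
# cgroup_mode_from_fstype is a hypothetical helper: it maps the filesystem
# type printed by `stat -fc %T` to the cgroup mode that type indicates.
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;            # unified hierarchy (cgroups v2 only)
        tmpfs)     echo "v1-or-hybrid" ;;  # legacy v1 or hybrid layout
        *)         echo "unknown" ;;       # not mounted, or stat unavailable
    esac
}

# Assumption: the conventional mount point; override CGROUP_ROOT if yours differs.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Fall back to "missing" if the path does not exist or stat lacks -f/-c.
fstype="$(stat -fc %T "$CGROUP_ROOT" 2>/dev/null || echo missing)"

echo "cgroup mode on $(uname -n): $(cgroup_mode_from_fstype "$fstype")"
```

Running this on cn02 and comparing against a node that still works (if you have one) should show fairly quickly whether a v1/v2 mismatch is in play.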

Hope that at least steers the conversation in a good direction.

Reed

> On Jun 15, 2023, at 5:04 PM, Tim Schneider <tim.schneider1 at tu-darmstadt.de> wrote:
> 
> Hi,
> I am maintaining the SLURM cluster of my research group. Recently I updated to Ubuntu 22.04 and Slurm 21.08.5 and ever since, I am unable to launch jobs. When launching a job, I receive the following error:
> 
> $ srun --nodes=1 --ntasks-per-node=1 -c 1 --mem-per-cpu 1G --time=01:00:00 --pty -p amd -w cn02 --pty bash -i
> srun: error: task 0 launch failed: Plugin initialization failed
> 
> Strangely, I cannot find any indication of this problem in the logs (find the logs attached). The problem must be related to the task/cgroup plugin, as it does not occur when I disable it.
> 
> After reading the documentation, I tried adding the kernel parameters cgroup_enable=memory swapaccount=1, but the problem persisted.
> 
> I would be very grateful for any advice on where to look, since I have no idea how to investigate this issue further.
> 
> Thanks a lot in advance.
> 
> Best,
> 
> Tim
> 
> 
> 
> <cgroup.conf><slurmd.log><slurmctld.log>
