[slurm-users] task/cgroup plugin causes "srun: error: task 0 launch failed: Plugin initialization failed" error on Ubuntu 22.04

abel pinto abelpc_uff at yahoo.com.br
Fri Jun 16 01:28:26 UTC 2023


Indeed, the issue seems to be that Ubuntu 22.04 no longer uses cgroups v1 by default; it boots with the unified cgroups v2 hierarchy. Does Slurm support cgroups v2? It seems so, at least in newer releases: https://slurm.schedmd.com/cgroup_v2.html
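
To confirm which cgroup layout a compute node is actually running, a quick check like the one below may help. It is only a sketch: the function name detect_cgroup_mode is made up for illustration, and it assumes the usual /sys/fs/cgroup mount point and the usual systemd layouts (a cgroup.controllers file at the root for v2, a unified/ subtree for hybrid mode, per-controller directories for v1).

```python
from pathlib import Path


def detect_cgroup_mode(root="/sys/fs/cgroup"):
    """Return 'v2', 'hybrid', 'v1', or 'unknown' for the cgroup mount at root.

    Hypothetical helper for illustration; not part of Slurm.
    """
    root = Path(root)
    # A cgroup v2 (unified) mount exposes cgroup.controllers at its root.
    if (root / "cgroup.controllers").is_file():
        return "v2"
    # systemd's hybrid layout keeps v1 controllers and mounts v2 under unified/.
    if (root / "unified" / "cgroup.controllers").is_file():
        return "hybrid"
    # Legacy v1 layouts have per-controller directories such as memory/ or cpuset/.
    if any((root / name).is_dir() for name in ("memory", "cpuset", "cpu")):
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print(detect_cgroup_mode())
```

Running this on the node that rejects the job (cn02 in the report below) should show whether slurmd is being asked to use a v1 plugin on a v2 system.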

/Abel

> On Jun 15, 2023, at 20:20, Reed Dier <reed.dier at focusvq.com> wrote:
> 
> I don’t have any direct advice off-hand, but I figure I will try to help steer the conversation in the right direction for figuring it out.
> 
> Since you mention 21.08.5, I’m going to assume that you are using the slurm-wlm packages from the Ubuntu repos rather than building Slurm yourself?
> 
> And have all the components (slurmctld(s), slurmdbd, slurmd(s)) been upgraded as well?
> 
> The only thing that immediately comes to mind is that I remember reading a good bit about Ubuntu 22.04’s use of cgroups v2, which as I understand it are very different from cgroups v1, and plenty of people have had issues with v1/v2 mismatches with slurm and other applications.
> 
> https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/
> https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1
> https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022
> 
> Hope that at least steers the conversation in a good direction.
> 
> Reed
> 
>> On Jun 15, 2023, at 5:04 PM, Tim Schneider <tim.schneider1 at tu-darmstadt.de> wrote:
>> 
>> Hi,
>> I maintain the Slurm cluster of my research group. I recently updated to Ubuntu 22.04 and Slurm 21.08.5, and I have been unable to launch jobs ever since. When launching a job, I receive the following error:
>> 
>> $ srun --nodes=1 --ntasks-per-node=1 -c 1 --mem-per-cpu 1G --time=01:00:00 --pty -p amd -w cn02 --pty bash -i
>> srun: error: task 0 launch failed: Plugin initialization failed
>> 
>> Strangely, I cannot find any indication of this problem in the logs (attached). The problem must be related to the task/cgroup plugin, as it does not occur when I disable it.
>> 
>> After reading the documentation, I tried adding the cgroup_enable=memory swapaccount=1 kernel parameters, but the problem persisted.
>> 
>> I would be very grateful for any advice on where to look, since I have no idea how to investigate this issue further.
>> 
>> Thanks a lot in advance.
>> 
>> Best,
>> 
>> Tim
>> 
>> 
>> 
>> <cgroup.conf><slurmd.log><slurmctld.log>
> 