<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi,</p>
<p>I just want to wrap this up in case someone has the same issue in
the future.</p>
<p>As Reed pointed out, Ubuntu 22.04 uses cgroups v2 by default
rather than cgroups v1. At the same time, the slurm-wlm package in
the Ubuntu repositories (Slurm 21.08.5) only supports cgroups v1,
which makes its task/cgroup plugin incompatible with the default
Ubuntu 22.04 setup.</p>
<p>My solution was to build Slurm 22.05 from source, making sure
that <i>libdbus-1-dev</i> is installed beforehand (otherwise
cgroups v2 support does not get built). This takes a bit more time
than installing the package, but it seems to work so far.</p>
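<p>For anyone repeating this, one quick sanity check after the build is
to look for the cgroups v2 plugin in Slurm's plugin directory. The
sketch below is only an illustration: the plugin file name
<i>cgroup_v2.so</i> and the directory <i>/usr/local/lib/slurm</i> (what a
source build without a custom prefix would typically use) are
assumptions, so adjust both to your installation.</p>
<pre>
```python
from pathlib import Path

# Assumption: a source build without --prefix installs plugins here.
DEFAULT_PLUGIN_DIR = "/usr/local/lib/slurm"
# Assumption: the cgroups v2 plugin is built as a shared object with this name.
V2_PLUGIN_NAME = "cgroup_v2.so"


def has_cgroup_v2_plugin(plugin_dir=DEFAULT_PLUGIN_DIR):
    """Return True if the cgroups v2 plugin file exists in plugin_dir."""
    return (Path(plugin_dir) / V2_PLUGIN_NAME).is_file()


if __name__ == "__main__":
    if has_cgroup_v2_plugin():
        print("cgroups v2 plugin found")
    else:
        print("cgroups v2 plugin missing; is libdbus-1-dev installed?")
```
</pre>
<p>If the plugin is missing, installing <i>libdbus-1-dev</i> and rebuilding
is the first thing I would try.</p>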
<p>Thanks a lot, Reed and Abel, for your advice!</p>
<p>Best,</p>
<p>Tim<br>
</p>
<div class="moz-cite-prefix">On 6/16/23 10:42, Tim Schneider wrote:<br>
</div>
<blockquote type="cite"
cite="mid:bce411cc-c8fd-b197-fb81-f1861850c04d@gmail.com">
<p>Hi again,</p>
<p>I just realized that the poster in <a
href="https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1"
class="moz-txt-link-freetext" moz-do-not-send="true">https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1</a>
wrote at some point that he built Slurm 22 instead of using the
Ubuntu repo version. So I guess I will have to look into that.</p>
<p>Best,</p>
<p>Tim</p>
<div class="moz-cite-prefix">On 6/16/23 10:36, Tim Schneider
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:cba7b791-a57a-a8c1-9278-233aae2664fe@gmail.com">
<p>Hi Abel and Reed,</p>
<p>Thanks a lot for your quick replies!</p>
<p>I did indeed just install slurm-wlm from the Ubuntu repos.<br>
</p>
<p>Following the advice of <a
href="https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1"
class="moz-txt-link-freetext" moz-do-not-send="true">https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1</a>,
I tried disabling cgroups v1 on Ubuntu, but that just leads to
an error during startup of slurmd:<br>
</p>
<p><i>slurmd: debug3: Trying to load plugin
/usr/lib/x86_64-linux-gnu/slurm-wlm/proctrack_cgroup.so</i><i><br>
</i><i>slurmd: error: unable to mount freezer cgroup
namespace: Invalid argument</i><i><br>
</i><i>slurmd: error: unable to create freezer cgroup
namespace</i><i><br>
</i><i>slurmd: error: Couldn't load specified plugin name for
proctrack/cgroup: Plugin init() callback failed</i><i><br>
</i><i>slurmd: error: cannot create proctrack context for
proctrack/cgroup</i><i><br>
</i><i>slurmd: error: slurmd initialization failed</i></p>
<p>So it seems that slurmd is using cgroups v1. This is also
reflected in the mounts (for the output below, cgroups v1 is
enabled again):</p>
<p><i>$ mount | grep cgroup</i><i><br>
</i><i>cgroup2 on /sys/fs/cgroup type cgroup2
(rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)</i><i><br>
</i><i>cgroup on /sys/fs/cgroup/freezer type cgroup
(rw,nosuid,nodev,noexec,relatime,freezer)</i></p>
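<p>As a rough illustration of how to read this output: a <i>cgroup2</i>
filesystem type marks the unified v2 hierarchy, while a <i>cgroup</i>
type marks a v1 controller hierarchy. The sketch below classifies
<i>mount</i>-style lines accordingly; the function name is hypothetical
and the sample lines are simply the two lines shown above.</p>
<pre>
```python
def classify_cgroup_mounts(lines):
    """Split mount output lines into cgroups v2 targets and v1 hierarchies."""
    v2_targets = []
    v1_hierarchies = {}
    for line in lines:
        parts = line.split()
        # Expected shape: SOURCE on TARGET type FSTYPE (OPTIONS)
        if len(parts) != 6 or parts[1] != "on" or parts[3] != "type":
            continue
        target, fstype = parts[2], parts[4]
        if fstype == "cgroup2":
            v2_targets.append(target)
        elif fstype == "cgroup":
            # Use the last path component as the v1 hierarchy name.
            v1_hierarchies[target.rsplit("/", 1)[-1]] = target
    return v2_targets, v1_hierarchies


SAMPLE = [
    "cgroup2 on /sys/fs/cgroup type cgroup2 "
    "(rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)",
    "cgroup on /sys/fs/cgroup/freezer type cgroup "
    "(rw,nosuid,nodev,noexec,relatime,freezer)",
]

v2, v1 = classify_cgroup_mounts(SAMPLE)
print("v2 mounts:", v2)
print("v1 hierarchies:", v1)
```
</pre>
<p>Seeing both kinds of mount at once, as here, means a v1 freezer
hierarchy is mounted alongside the v2 unified hierarchy.</p>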
<p>What is still confusing to me is that the slurmd logs
indicate no error when I try running with cgroups v1 enabled
and the error only appears on the slurmctld side.</p>
<p>Do you know how I can enable cgroups v2 in Slurm? It seems
to me that this is what the poster in <a
href="https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1"
class="moz-txt-link-freetext" moz-do-not-send="true">https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1</a>
did.</p>
<p>Best,</p>
<p>Tim<br>
</p>
<div class="moz-cite-prefix">On 6/16/23 03:28, abel pinto wrote:<br>
</div>
<blockquote type="cite"
cite="mid:2be418c2771243088a515d1b9ef490dc@TU-EX070.ads.tu-darmstadt.de">
Indeed, the issue seems to be that Ubuntu 22.04 does not
support cgroups v1 anymore. Does SLURM support cgroups v2? It
seems so: <a href="https://slurm.schedmd.com/cgroup_v2.html"
moz-do-not-send="true" class="moz-txt-link-freetext">https://slurm.schedmd.com/cgroup_v2.html</a><br>
<br>
<div dir="ltr">/Abel</div>
<div dir="ltr"><br>
<blockquote type="cite">On Jun 15, 2023, at 20:20, Reed Dier
<a class="moz-txt-link-rfc2396E"
href="mailto:reed.dier@focusvq.com"
moz-do-not-send="true">&lt;reed.dier@focusvq.com&gt;</a>
wrote:<br>
<br>
</blockquote>
</div>
<blockquote type="cite">
<div dir="ltr">I don’t have any direct advice off-hand, but
I figure I will try to help steer the conversation in the
right direction for figuring it out.
<div class=""><br class="">
</div>
<div class="">I’m going to assume that since you mention
21.08.5, that this means you are using the slurm-wlm
packages from the ubuntu repos, and not building
yourself?</div>
<div class=""><br class="">
</div>
<div class="">And have all the components (slurmctld(s),
slurmdbd, slurmd(s)) been upgraded as well?</div>
<div class=""><br class="">
</div>
<div class="">
<div>The only thing that immediately comes to mind is
that I remember reading a good bit about Ubuntu
22.04’s use of cgroups v2, which, as I understand it,
is very different from cgroups v1, and plenty of
people have had issues with v1/v2 mismatches with
Slurm and other applications.</div>
<div><br class="">
</div>
<div><a
href="https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/"
class="moz-txt-link-freetext" moz-do-not-send="true">https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/</a></div>
<div><a
href="https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1"
class="moz-txt-link-freetext" moz-do-not-send="true">https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1</a></div>
<div><a
href="https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022"
class="moz-txt-link-freetext" moz-do-not-send="true">https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022</a></div>
<div><br class="">
</div>
<div>Hope that at least steers the conversation in a
good direction.</div>
<div><br class="">
</div>
<div>Reed</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Jun 15, 2023, at 5:04 PM, Tim
Schneider &lt;<a
href="mailto:tim.schneider1@tu-darmstadt.de"
class="moz-txt-link-freetext"
moz-do-not-send="true">tim.schneider1@tu-darmstadt.de</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">Hi,
<div class="moz-forward-container">
<p class="">I am maintaining the SLURM cluster
of my research group. Recently I updated to
Ubuntu 22.04 and Slurm 21.08.5 and ever
since, I am unable to launch jobs. When
launching a job, I receive the following
error:</p>
<p class=""><i class="">$ srun --nodes=1
--ntasks-per-node=1 -c 1 --mem-per-cpu 1G
--time=01:00:00 --pty -p amd -w cn02 --pty
bash -i</i><i class=""><br class="">
</i><i class="">srun: error: task 0 launch
failed: Plugin initialization failed</i></p>
<p class="">Strangely, I cannot find any
indication of this problem in the logs (find
the logs attached). The problem must be
related to the task/cgroup plugin, as it
does not occur when I disable it.</p>
<p class="">After reading in the
documentation, I tried adding the <i
class="">cgroup_enable=memory
swapaccount=1</i> kernel parameters, but
the problem persisted.</p>
<p class="">I would be very grateful for any
advice where to look since I have no idea
how to investigate this issue further.</p>
<p class="">Thanks a lot in advance.</p>
<p class="">Best,</p>
<p class="">Tim<br class="">
</p>
<p class=""><br class="">
</p>
</div>
</div>
<span
id="cid:20813A19-49FA-477E-A30F-A1870FA53F1A">&lt;cgroup.conf&gt;</span><span
id="cid:6EA7789A-0BBC-46F4-A871-21FF5344DAA2">&lt;slurmd.log&gt;</span><span
id="cid:1BEA19F6-B8FD-4E05-96A4-E00AB43D54F6">&lt;slurmctld.log&gt;</span></div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</body>
</html>