<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Hans,</p>
<p><br>
</p>
<p>We run Slurm in k8s at ETH Zurich to manage physical compute
nodes. The link you included and Nicolas's follow-up already cover
the basics.</p>
<p><br>
</p>
<p>We build several Docker images based on CentOS 7 (for now),
with Slurm compiled from source, for the following services:<br>
</p>
<ul>
<li>slurmdbd</li>
<li>slurmctld</li>
<li>slurmd (used for testing as “containerized nodes”)</li>
</ul>
All of these containers include an sssd daemon that interfaces with the
central LDAP, though we are looking at ways to streamline this part.<br>
<p>We also use several helper containers, such as MariaDB, a Prometheus
exporter, a file server for the code and configuration (used to
transfer these to the physical nodes), and a controller that
configures users, accounts, QOS, … in Slurm.</p>
<p><br>
</p>
<p>PVCs hosted on an NFS appliance provide data persistence.<br>
</p>
<p><br>
</p>
<p>A Helm chart is used for deploying to a local test k8s
instance, a test/staging cluster, and the production cluster. The
chart and containers are site-specific, but I am happy to share the
relevant code &amp; config with you if you contact me by PM.<br>
</p>
<br>
<p>With kind regards,<br>
</p>
<p>Urban<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 2022-11-14 09:42, Viessmann
Hans-Nikolai (PSI) wrote:<br>
</div>
<blockquote type="cite"
cite="mid:GV0P278MB0033013BB0235A7259715853B3059@GV0P278MB0033.CHEP278.PROD.OUTLOOK.COM">
<pre class="moz-quote-pre" wrap="">Good Morning,
I'm working on a project at work to run SLURM cluster management components
(slurmctld and slurmdbd) as K8s pods, which manage a cluster of physical compute
nodes. I've come upon a few discussions of doing this (or more generally running
SLURM in containers); I especially found this one
(see <a class="moz-txt-link-freetext" href="https://groups.google.com/g/slurm-users/c/uevFWPHHr2U/m/fkwusc0JDwAJ">https://groups.google.com/g/slurm-users/c/uevFWPHHr2U/m/fkwusc0JDwAJ</a>)
very helpful.
Are there any further details or advice anyone has on such a setup?
Thank you and kind regards,
Hans
---------------------------------------------------------------------------------------------
Paul Scherrer Institut
Hans-Nikolai Viessmann
High Performance Computing &amp; Emerging Technologies
Building/Room: OHSA/D02
Forschungsstrasse 111
5232 Villigen PSI
Switzerland
Telephone: +41 56 310 41 24
E-Mail: <a class="moz-txt-link-abbreviated" href="mailto:hans-nikolai.viessmann@psi.ch">hans-nikolai.viessmann@psi.ch</a>
GPG: 46F7 826E 80E1 EE45 2DCA 1BFC A39B E4B6 EA0C E4C4
</pre>
</blockquote>
<pre class="moz-signature" cols="72">--
ETH Zurich, Dr. Urban Borštnik
High Performance Computing, Scientific IT Services
OCT G35, Binzmühlestrasse 130, 8092 Zurich, Switzerland
Phone +41 44 632 3512, <a class="moz-txt-link-freetext" href="http://www.id.ethz.ch/">http://www.id.ethz.ch/</a>
<a class="moz-txt-link-abbreviated" href="mailto:urban.borstnik@id.ethz.ch">urban.borstnik@id.ethz.ch</a></pre>
</body>
</html>