[slurm-users] SLURM in K8s, any advice?

Urban Borštnik urban.borstnik at id.ethz.ch
Wed Nov 16 15:45:56 UTC 2022


Hi Hans,


We run Slurm in k8s at ETH Zurich to manage physical compute nodes.
The link you included and Nicolas's follow-up already cover the basics.


We build several Docker containers based on CentOS 7 (for now) with 
Slurm compiled from source for the following services:

  * slurmdbd
  * slurmctld
  * slurmd (used for testing as “containerized nodes”)

All these containers include an sssd daemon that interfaces with the
central LDAP, though we are looking at ways to streamline this part.

We use several helper containers, such as mariadb, a prometheus 
exporter, a file server for the code and configuration (used to transfer 
these to the physical nodes), and a controller that configures users, 
accounts, QOS, … into Slurm.
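
For illustration only (this is not the ETH code), a controller of that 
kind could be sketched in Python as a function that turns a desired list 
of QOS, accounts, and users into sacctmgr invocations. The function name 
and the example account, user, and QOS names are hypothetical; the sketch 
only builds the command lines and leaves running them to the caller.

```python
import shlex


def sacctmgr_commands(accounts, users, qos_names):
    """Build sacctmgr argument lists for the desired QOS, accounts and users.

    accounts:  iterable of account names
    users:     iterable of (user, account) pairs
    qos_names: iterable of QOS names
    All names are supplied by the caller; nothing is executed here.
    """
    cmds = []
    # QOS and accounts go first so that users can be attached to an account.
    for qos in qos_names:
        cmds.append(["sacctmgr", "-i", "add", "qos", qos])
    for account in accounts:
        cmds.append(["sacctmgr", "-i", "add", "account", account])
    for user, account in users:
        cmds.append(["sacctmgr", "-i", "add", "user", user, f"Account={account}"])
    return cmds


if __name__ == "__main__":
    # Hypothetical example data; print the commands instead of running them.
    for cmd in sacctmgr_commands(["physics"], [("alice", "physics")], ["long"]):
        print(shlex.join(cmd))
```

The -i flag makes sacctmgr commit without an interactive prompt, which is 
what a non-interactive controller would need. A real controller would 
also have to compare against the existing state and handle removals.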


PVCs hosted on an NFS appliance provide data persistence.


A Helm chart is used to deploy to a local test k8s instance, a
test/staging cluster, and the production cluster. The chart and
containers are site-specific, but I am happy to share the relevant code &
config with you if you contact me by PM.
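
As a rough sketch of deploying one chart to several environments, 
assuming one values file and one kube context per environment, the helm 
command lines could be generated as below. The release name, chart path, 
context names, and values file names are all placeholders, not the 
actual site setup.

```python
import shlex

# Hypothetical per-environment settings; contexts and file names are placeholders.
ENVIRONMENTS = {
    "local": {"context": "local-test", "values": "values-local.yaml"},
    "staging": {"context": "staging", "values": "values-staging.yaml"},
    "production": {"context": "production", "values": "values-production.yaml"},
}


def helm_command(env, release="slurm", chart="./slurm-chart"):
    """Return the helm argument list that installs or upgrades one environment."""
    cfg = ENVIRONMENTS[env]  # raises KeyError for an unknown environment
    return [
        "helm", "upgrade", "--install", release, chart,
        "--kube-context", cfg["context"],
        "--values", cfg["values"],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them.
    for env in ENVIRONMENTS:
        print(shlex.join(helm_command(env)))
```

Keeping the differences between environments in values files means the 
same chart is exercised in testing and in production.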


With kind regards,

Urban


On 2022-11-14 09:42, Viessmann Hans-Nikolai (PSI) wrote:
> Good Morning,
>
> I'm working on a project at work to run SLURM cluster management components
> (slurmctld and slurmdbd) as K8s pods, which manage a cluster of physical compute
> nodes. I've come upon a few discussions of doing this (or more generally running
> SLURM in containers); I especially found this one
> (see https://groups.google.com/g/slurm-users/c/uevFWPHHr2U/m/fkwusc0JDwAJ)
> very helpful.
>
> Are there any further details or advice anyone has on such a setup?
>
> Thank you and kind regards,
> Hans
>
> ---------------------------------------------------------------------------------------------
> Paul Scherrer Institut
> Hans-Nikolai Viessmann
> High Performance Computing & Emerging Technologies
> Building/Room: OHSA/D02
> Forschungsstrasse 111
> 5232 Villigen PSI
> Switzerland
>
> Telephone: +41 56 310 41 24
> E-Mail: hans-nikolai.viessmann at psi.ch
> GPG: 46F7 826E 80E1 EE45 2DCA 1BFC A39B E4B6 EA0C E4C4

-- 
ETH Zurich, Dr. Urban Borštnik
High Performance Computing, Scientific IT Services
OCT G35, Binzmühlestrasse 130, 8092 Zurich, Switzerland
Phone +41 44 632 3512, http://www.id.ethz.ch/
urban.borstnik at id.ethz.ch