There is a Kubeflow offering that might be of interest:
https://www.dkube.io/post/mlops-on-hpc-slurm-with-kubeflow
I have not tried it myself, so I have no idea how well it works.
Regards,
--Dani_L.
Bright Cluster Manager's marketing site says it can manage a cluster running both Kubernetes and Slurm, though maybe I misunderstood it. Either way, I am increasingly encountering groups that want to run a stack of containers that needs private container networking.
What’s the current state of using the same HPC cluster for both Slurm and Kube?
Note: I’m aware that I can run Kube on a single node, but we need more resources than that. Ultimately, we need a way for Slurm and Kube to coexist in the same cluster, both sharing the full pool of resources and both fully aware of resource usage.
Thanks,
Daniel Healy