[slurm-users] do oversubscription with algorithm other than least-loaded?
Herc Silverstein
herc.silverstein at schrodinger.com
Thu Feb 24 21:35:30 UTC 2022
Hi,
We would like to use oversubscription on a cluster that's running in the
cloud. The cluster dynamically spins CPU nodes up and down as needed.
What we see is that the least-loaded algorithm causes the maximum number
of nodes specified in the partition to be spun up, each loaded with N
jobs for the N CPUs in a node, before it "doubles back" and starts
oversubscribing.
What we actually want is for the minimum number of nodes to be used
and for it to fully load (to the limit of the oversubscription setting)
one node before starting up another. That is, we really want a
"most-loaded" algorithm. This would allow us to reduce the number of
nodes we need to run and reduce costs.
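
To make the difference concrete, here is a toy Python sketch of the two
placement policies. It is only an illustration under simplifying
assumptions (one CPU per job, jobs placed one at a time, no jobs finishing)
and is not Slurm's actual scheduling logic; the names place_jobs,
nodes_used, cpus_per_node and oversubscribe are made up for this example.

```python
def place_jobs(num_jobs, max_nodes, cpus_per_node, oversubscribe, policy):
    """Place single-CPU jobs one at a time; return the job count per node.

    Toy model only: each node accepts up to cpus_per_node * oversubscribe
    jobs. The policy names are hypothetical labels, not Slurm options.
    """
    capacity = cpus_per_node * oversubscribe
    loads = [0] * max_nodes
    for _ in range(num_jobs):
        # Nodes that can still accept another job.
        candidates = [i for i, load in enumerate(loads) if load < capacity]
        if not candidates:
            raise RuntimeError("no node has capacity left")
        if policy == "least_loaded":
            # Emptiest node first; ties go to the lowest node index.
            target = min(candidates, key=lambda i: (loads[i], i))
        elif policy == "most_loaded":
            # Fullest node that still has room; ties go to the lowest index.
            target = max(candidates, key=lambda i: (loads[i], -i))
        else:
            raise ValueError(f"unknown policy: {policy}")
        loads[target] += 1
    return loads


def nodes_used(loads):
    """Number of nodes that received at least one job (i.e. were spun up)."""
    return sum(1 for load in loads if load > 0)


if __name__ == "__main__":
    for policy in ("least_loaded", "most_loaded"):
        loads = place_jobs(num_jobs=8, max_nodes=4, cpus_per_node=4,
                           oversubscribe=2, policy=policy)
        print(policy, loads, "nodes used:", nodes_used(loads))
```

In this toy model, 8 jobs on 4-CPU nodes with an oversubscription factor
of 2 end up spread over all 4 nodes under "least_loaded", but fit on a
single node under "most_loaded", which is the cost saving we are after.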
Is there a way to get this behavior somehow?
Herc