[slurm-users] do oversubscription with algorithm other than least-loaded?

Herc Silverstein herc.silverstein at schrodinger.com
Thu Feb 24 21:35:30 UTC 2022


Hi,

We would like to do over-subscription on a cluster that's running in the 
cloud.  The cluster dynamically spins CPU nodes up and down as needed.  
What we see is that the least-loaded algorithm causes the maximum number 
of nodes specified in the partition to be spun up, each loaded with N 
jobs for the N CPUs in a node, before it "doubles back" and starts 
over-subscribing.

What we actually want is for the minimum number of nodes to be used, 
with one node fully loaded (to the limit of the oversubscription setting) 
before another is started. That is, we really want a 
"most-loaded" algorithm.  This would allow us to reduce the number of 
nodes we need to run and reduce costs.
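
To make the difference concrete, here is a small toy simulation of the two
placement orders described above. It is only a sketch and not Slurm's
actual scheduling code: the function name place_jobs, the policy labels
"observed" and "most-loaded", and the example sizes (4 nodes, 4 CPUs per
node, oversubscription factor 2) are all made up for illustration, and
every job is assumed to need exactly one CPU.

```python
def place_jobs(num_jobs, max_nodes, cpus_per_node, oversubscribe, policy):
    """Place single-CPU jobs one at a time; return the job count per node.

    Toy model only. A node with a count of 0 is treated as never started.
    """
    capacity = cpus_per_node * oversubscribe  # job limit per node
    nodes = [0] * max_nodes
    for _ in range(num_jobs):
        open_nodes = [i for i, n in enumerate(nodes) if n < capacity]
        if not open_nodes:
            raise RuntimeError("no capacity left for another job")
        if policy == "observed":
            # Behavior described in the post: use any node that still has
            # a free CPU first, and over-subscribe only when none is left.
            free = [i for i in open_nodes if nodes[i] < cpus_per_node]
            if free:
                target = free[0]
            else:
                target = min(open_nodes, key=lambda i: nodes[i])
        elif policy == "most-loaded":
            # Desired behavior: keep filling the busiest node that still
            # has room, so a new node starts only when the others are full.
            target = max(open_nodes, key=lambda i: nodes[i])
        else:
            raise ValueError(f"unknown policy: {policy}")
        nodes[target] += 1
    return nodes


def nodes_used(nodes):
    """Count nodes that received at least one job."""
    return sum(1 for n in nodes if n > 0)


if __name__ == "__main__":
    for policy in ("observed", "most-loaded"):
        layout = place_jobs(16, 4, 4, 2, policy)
        print(policy, layout, "nodes used:", nodes_used(layout))
```

With these example numbers, 16 jobs end up as [4, 4, 4, 4] (four nodes
running) under the "observed" order and as [8, 8, 0, 0] (two nodes
running) under the "most-loaded" order, which is the cost saving we are
after.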

Is there a way to get this behavior somehow?

Herc

