[slurm-users] spreading jobs out across the cluster

Loris Bennett loris.bennett at fu-berlin.de
Wed Jun 14 11:44:35 UTC 2023


Hi Stephen,

"Stephen Berg, Code 7309" <stephen.berg at nrlssc.navy.mil> writes:

> I'm currently testing a new slurm setup before converting an existing
> pbs/torque grid over.  Right now I've got 8 nodes in one partition, 48 
> cores on each.  There's a second partition of older systems configured
> as 4 core nodes so the users can run some serial jobs.
>
> During some testing I've noticed that jobs always seem to take the
> nodes in a top down fashion.  If I queue up a bunch of 3 node jobs
> they take nodes 1, 2 and 3 for one job, 4, 5 and 6 for another. Nodes 7
> and 8 never get used.  I'd like to have slurm spread the jobs out
> across the nodes in a round robin fashion or even randomly.  My config
> is really basic right now, I'm using defaults for most everything.
>
> Which settings could get the jobs spread out across the nodes in each
> partition a bit more fairly?

You can set 

  LLN=YES

("least loaded nodes") in the configuration of the partition (see 'man
slurm.conf').
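
As a rough sketch, assuming your eight 48-core nodes are called
node01 to node08 and the partition is called 'batch' (both names are
made up here, so substitute your own), the relevant lines in slurm.conf
might look something like this:

  # Hypothetical node and partition names; adjust to your cluster.
  NodeName=node[01-08] CPUs=48 State=UNKNOWN
  # LLN=YES asks the scheduler to prefer the least loaded nodes
  # in this partition.
  PartitionName=batch Nodes=node[01-08] Default=YES State=UP LLN=YES

After editing the file, 'scontrol reconfigure' should pick up the
change, and 'scontrol show partition batch' should let you check that
the option is active.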

However, this is often not what you want.  If you maximise the number
of nodes in use, you won't be able to save energy by powering down nodes
which are not required.  What is your use-case for wanting to spread the
jobs out?

Cheers,

Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin


