[slurm-users] Spreading jobs across servers instead of loading up individual nodes

Aravindh Sampathkumar aravindh at fastmail.com
Thu Nov 15 02:25:00 MST 2018


Hi All.

I'm having some trouble finding the appropriate section of the
documentation for changing Slurm's resource allocation policy.
We have configured CPU and memory as consumable resources, and our nodes
can run multiple jobs as long as CPU cores and memory are available.
What I want is for Slurm to spread jobs across all available servers in
a partition instead of loading up a few servers while the others sit idle.
For example, I have a partition, nav, which has 5 compute nodes
(node[1-5]) dedicated to it. When users submit 3 jobs to the nav
partition, each requesting 1 CPU core and 1 GB of memory, Slurm
schedules all the jobs on node1 because it has enough CPU cores and
memory to satisfy the job requirements. Nodes 2, 3, 4 and 5 remain idle.
What I want instead is for Slurm to schedule job1 on node1, job2 on
node2, job3 on node3, and so on. Later, if there are more jobs than
nodes, Slurm should use the remaining resources on node1.
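
To make the difference between the two placement behaviours concrete,
here is a minimal Python sketch. It is not Slurm code and does not use
any Slurm API; the node sizes (4 cores and 16 GB per node) and the names
Node, place and schedule are assumptions made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: int            # total cores (hypothetical value, not from our cluster)
    mem_gb: int          # total memory in GB (hypothetical value)
    used_cpus: int = 0
    used_mem_gb: int = 0

    def fits(self, cpus, mem_gb):
        """True if the node still has enough free cores and memory."""
        return (self.cpus - self.used_cpus >= cpus
                and self.mem_gb - self.used_mem_gb >= mem_gb)


def place(nodes, cpus, mem_gb, spread):
    """Pick a node for one job and record the allocation.

    spread=False: take the first node that fits (jobs pile up on node1).
    spread=True:  take the fitting node with the fewest cores in use.
    Returns the node name, or None if no node can hold the job.
    """
    candidates = [n for n in nodes if n.fits(cpus, mem_gb)]
    if not candidates:
        return None
    if spread:
        # min() returns the first of equally loaded nodes, so ties go to
        # the lowest-numbered node.
        chosen = min(candidates, key=lambda n: n.used_cpus)
    else:
        chosen = candidates[0]
    chosen.used_cpus += cpus
    chosen.used_mem_gb += mem_gb
    return chosen.name


def schedule(num_jobs, spread, num_nodes=5, cpus=4, mem_gb=16):
    """Place num_jobs jobs of 1 core / 1 GB each on a fresh set of nodes."""
    nodes = [Node(f"node{i}", cpus, mem_gb) for i in range(1, num_nodes + 1)]
    return [place(nodes, 1, 1, spread) for _ in range(num_jobs)]


if __name__ == "__main__":
    print("packed:", schedule(3, spread=False))
    print("spread:", schedule(3, spread=True))
```

Under these assumptions, schedule(3, spread=False) puts all three jobs
on node1, which is what we see today, while schedule(3, spread=True)
gives node1, node2, node3, and a sixth job wraps back around to node1.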

Why? 
A small group that uses this partition is concerned that when all their
jobs are scheduled on the same node, the jobs have to share network
bandwidth and bandwidth to local disk. If the jobs were spread out
instead, each would get more bandwidth.
I'd appreciate any advice on how I can make this happen.

Thanks,
  Aravindh Sampathkumar
  aravindh at fastmail.com

