[slurm-users] Controlling access to idle nodes

Thomas M. Payerle payerle at umd.edu
Tue Oct 6 17:50:41 UTC 2020


We use a scavenger partition, and although we do not have the policy you
describe, it could be used in your case.

Assume you have 6 nodes (node-[0-5]) and two groups A and B.
Create partitions
partA = node-[0-2]
partB = node-[3-5]
all = node-[0-5]
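
As a rough sketch, the three partitions could be declared in slurm.conf
along the following lines, with "all" spanning the six nodes node-[0-5].
The NodeName hardware details are omitted, and State=UP is just an assumed
setting; adjust to your site.

```
# slurm.conf fragment (sketch only; NodeName definitions omitted)
PartitionName=partA Nodes=node-[0-2] State=UP
PartitionName=partB Nodes=node-[3-5] State=UP
PartitionName=all   Nodes=node-[0-5] State=UP
```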

Create two QoSes, normal and scavenger.
Allow the normal QoS to preempt jobs running with the scavenger QoS.
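
Something like the following sacctmgr commands should set that up.  This
is a sketch, not a tested recipe: it assumes the default "normal" QoS
already exists, and that slurm.conf sets PreemptType=preempt/qos together
with a PreemptMode such as REQUEUE or CANCEL.  Note that PreemptType takes
a single plugin, so a site already using partition-based preemption would
need to think about how the two schemes fit together.

```
# Create the scavenger QoS ("normal" normally exists by default)
sacctmgr add qos scavenger

# Let jobs running under normal preempt jobs running under scavenger
sacctmgr modify qos normal set Preempt=scavenger

# Check the result
sacctmgr show qos format=Name,Preempt,PreemptMode
```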

In sacctmgr, give members of group A access to use partA with normal QoS
and group B access to use partB with normal QoS
Allow both A and B to use the all partition with scavenger QoS.
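
A minimal sketch of the associations, using hypothetical accounts groupA
and groupB and hypothetical users alice and bob (none of these names come
from your setup).  For the restrictions to be enforced, slurm.conf also
needs something like AccountingStorageEnforce=associations,limits,qos.

```
# Hypothetical accounts, one per group
sacctmgr add account groupA
sacctmgr add account groupB

# alice (group A): normal QoS on partA, scavenger QoS on all
sacctmgr add user alice account=groupA partition=partA qos=normal defaultqos=normal
sacctmgr add user alice account=groupA partition=all qos=scavenger defaultqos=scavenger

# bob (group B): normal QoS on partB, scavenger QoS on all
sacctmgr add user bob account=groupB partition=partB qos=normal defaultqos=normal
sacctmgr add user bob account=groupB partition=all qos=scavenger defaultqos=scavenger
```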

So members of A can launch jobs on partA with normal QoS (probably want to
make that their default), and similarly members of B can launch jobs on
partB with normal QoS.
But members of A can also launch jobs on the partB nodes by submitting to
the all partition with scavenger QoS, and vice versa.  If B then needs the
partB nodes that A's scavenger jobs are using, those scavenger jobs will
get preempted.
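
From the user side, a member of group A would then submit roughly like
this (job.sh is a placeholder script name; --requeue only matters if the
preempt mode is REQUEUE):

```
# Own nodes, normal QoS: not preemptable by the other group
sbatch --partition=partA --qos=normal job.sh

# Opportunistic use of any idle node, scavenger QoS: may be preempted
sbatch --partition=all --qos=scavenger --requeue job.sh
```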

This is not automatic (users need to explicitly say they want to run jobs
on the other half of the cluster), but that is probably reasonable because
there are some jobs one does not want preempted, even if that means
waiting a while in the queue.

On Tue, Oct 6, 2020 at 11:12 AM David Baker <D.J.Baker at soton.ac.uk> wrote:

> Hello,
>
> I would appreciate your advice on how to deal with this situation in
> Slurm, please. I have a set of nodes used by 2 groups, and normally each
> group would have access to half the nodes. So, I could limit each
> group to have access to 3 nodes each, for example. I am trying to devise a
> scheme that allows each group to make best use of the nodes always. In other
> words, each group could potentially use all the nodes (assuming they are all
> free and the other group isn't using the nodes at all).
>
> I cannot set hard and soft limits in Slurm, and so I'm not sure how to
> make the situation flexible. Ideally it would be good for each group to be
> able to use their allocation and then take advantage of any idle nodes via
> a scavenging mechanism. The other group could then pre-empt the scavenger
> jobs and claim their nodes. I'm struggling with this since this seems like
> a two-way scavenger situation.
>
> Could anyone please help? I have, by the way, set up partition-based
> pre-emption in the cluster. This allows the general public to scavenge
> nodes owned by research groups.
>
> Best regards,
> David
>
>
>

-- 
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads        payerle at umd.edu
5825 University Research Park               (301) 405-6135
University of Maryland
College Park, MD 20740-3831