[slurm-users] floating condo partition, no pre-emption, guarantee a max pend time?

Paul Brunk pbrunk at uga.edu
Wed Apr 22 21:43:39 UTC 2020

Hi all:

[ BTW this is the same situation that the submitter of https://bugs.schedmd.com/show_bug.cgi?id=2692 presented. ]

We have a non-Slurm cluster in production and are developing our next one, which will run Slurm 20.02.X.

We have a partition "batch" which is open to all users.  Half of the nodes are 'ownerless', while some PIs have bought nodes.  In production now, there's a distinct partition for each such PI, and her physical nodes are allocated to her partition only.

But for the Slurm cluster, we want to add the ability to have PIs buy prioritized resource allocations, rather than physical nodes.  If a PI contributed 20 nodes' worth of money (80 cores' worth, let's say), then we want the following to hold:

(a) until either (PI has no small-enough jobs pending) or (PI is using 80
    batch-partition cores), idle batch-partition cores are allocated
    to this PI's jobs first.

(b) until the PI is using 80 batch-partition cores, her pending jobs
    small enough to fit inside the unused-by-this-PI subset of that
    80-core set will have to wait no more than 2 hours, say.

(c) the "batch" partition will have a max runtime longer than the 2hrs
    max pend time stated in the PI's SLA.  Many "batch" jobs are < 2
    hrs though.

(d) we don't pre-empt (since we don't do that here).
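
To make (b) concrete, here is a minimal Python sketch of which pending jobs would fall under the 2-hour guarantee.  The names (PendingJob, guaranteed_jobs) and the numbers are hypothetical and only model the policy; nothing here is a Slurm API.

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    job_id: int
    cores: int


def guaranteed_jobs(entitlement, cores_in_use, pending):
    """Return the pending jobs that fit in the PI's unused entitlement.

    entitlement  -- cores the PI has paid for (80 in the example above)
    cores_in_use -- batch-partition cores the PI's running jobs hold now
    pending      -- list of PendingJob owned by the PI
    """
    headroom = max(entitlement - cores_in_use, 0)
    # Each job is judged on its own: it qualifies if it alone fits
    # inside the part of the entitlement the PI is not using.
    return [job for job in pending if job.cores <= headroom]
```

For example, a PI with an 80-core entitlement who is using 56 cores has 24 cores of headroom, so pending jobs of 16 and 24 cores would be covered by the guarantee, while a 32-core job would not.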

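Defining a floating partition with a group core limit of 80 (in Slurm terms, something like GrpTRES=cpu=80 on a partition QOS), giving it very high priority, and assigning the "batch" partition's nodes to it would do much of what we want, but it wouldn't provide the "within two hours" part: with no pre-emption, the PI's job may have to wait for running "batch" jobs to finish, and the "batch" partition's max runtime is longer than two hours.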
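
The limitation can be illustrated with a small Python sketch of the worst-case wait when nothing is pre-empted.  It assumes every running job uses its full remaining time limit and that every freed core goes to the PI's job (i.e. top priority); it ignores node layout.  The function name and the numbers are hypothetical.

```python
def wait_for_cores(needed, idle, running):
    """Minutes until `needed` cores are free, without pre-emption.

    needed  -- cores the PI's pending job requires
    idle    -- cores idle right now
    running -- list of (remaining_minutes, cores) for running jobs
    Returns None if the cores never become available.
    """
    free = idle
    if free >= needed:
        return 0
    # Release cores in the order the running jobs reach their limits.
    for remaining, cores in sorted(running):
        free += cores
        if free >= needed:
            return remaining
    return None
```

With no idle cores and running jobs of (30 min, 4 cores), (600 min, 8 cores) and (2880 min, 8 cores), a 16-core job waits 2880 minutes: high priority alone only bounds the wait by the longest remaining runtime, not by 2 hours.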

Does anyone know of a way to satisfy all of (a)-(d)?

As in the original posting, the only idea I've come up with is this:  a floating-through-time 2-hr reservation on N cores would ensure their availability within 2 hrs.  But I'd need to automate two things somehow: making those reserved cores available to that PI alone as soon as the floating-through-time reservation on them is removed, and managing the reservation's node membership.  I don't assume that a good answer resembles that at all.
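
For reference, a floating-through-time reservation is created in Slurm with the TIME_FLOAT flag and a StartTime relative to now.  The Python sketch below only builds the scontrol command line and does not run it.  The helper name, reservation name, account, node list and duration are hypothetical placeholders, and it does nothing about the automation problems mentioned above.

```python
def floating_reservation_cmd(name, account, nodes,
                             lead_minutes=120, duration="12:00:00"):
    """Build (but do not run) an scontrol command for a reservation
    whose start time always stays `lead_minutes` in the future.

    All argument values are site-specific; `duration` is an arbitrary
    placeholder here.
    """
    return [
        "scontrol", "create", "reservation",
        f"ReservationName={name}",
        f"Accounts={account}",
        f"Nodes={nodes}",
        # With TIME_FLOAT the start time is relative to the current time,
        # so only jobs that end before now+lead_minutes can start on
        # the reserved nodes.
        f"StartTime=now+{lead_minutes}minutes",
        f"Duration={duration}",
        "Flags=TIME_FLOAT",
    ]
```

Calling " ".join(floating_reservation_cmd("pi_condo", "pi_lab", "node[01-04]")) gives a command line that could be reviewed before being run by hand or from a script.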

Thanks for any insights!

Paul Brunk, system administrator
Georgia Advanced Computing Resource Center (GACRC)
Enterprise IT Svcs, the University of Georgia
