[slurm-users] Increasing job priority based on resources requested.
pbisbal at pppl.gov
Fri Apr 19 18:11:44 UTC 2019
Thanks for the response. I am already doing that with weights for the
nodes, but I was hoping to go one step further. Personally, I think
using weights like this should be an acceptable approach, but during a
user discussion, users wanted the large memory jobs to "go to the head
of the line" for the large mem systems. In the past, we have had issues
with small jobs using up the large memory nodes and blocking jobs that
truly need that amount of memory, so there is a basis for this request.
In practice, if I can't make this happen, I don't think users would
notice that much, since this would only be an issue if our cluster is
100% full, and our cluster averages about 65% utilization right now.
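
To make the trade-off concrete, here is a rough sketch in Python of how the node-weight approach behaves. It is only a toy model, not Slurm's actual scheduler code: the node names, memory sizes, and weight values are made up for illustration, and the only Slurm behavior assumed is that, all else being equal, the lowest-weight nodes that satisfy a request are allocated first.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str        # hypothetical node name
    mem_gb: int      # memory installed on the node
    weight: int      # lower weight = preferred, as with node weights in Slurm
    busy: bool = False


def pick_node(nodes, req_mem_gb):
    """Return the idle node with the lowest weight that fits the request, or None."""
    candidates = [n for n in nodes if not n.busy and n.mem_gb >= req_mem_gb]
    return min(candidates, key=lambda n: n.weight, default=None)


# Illustrative cluster: one ordinary node and one large-memory node.
nodes = [
    Node("small01", mem_gb=64, weight=10),
    Node("big01", mem_gb=512, weight=100),
]

# A small job lands on the small node while it is free.
first = pick_node(nodes, req_mem_gb=16)

# Once the small node is busy, the next small job spills onto the large node.
nodes[0].busy = True
second = pick_node(nodes, req_mem_gb=16)
```

In this model `first` is `small01` and `second` is `big01`, which is the case the users are worried about: weights keep small jobs off the large memory nodes only while smaller nodes are still free.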
On 4/19/19 11:39 AM, Chris M. Miller wrote:
> I don't have a good answer to your original question, but I'll note I have a similar concern and solved it a different way. What we did was use lower weights in the node definitions for the "smaller" (less feature-rich) nodes, and extra high weights for nodes with unique features (like GPUs, in our case). In this way, jobs are scheduled to the smallest available nodes they fit in, and the larger or more feature-rich nodes have a kind of soft reservation either for large jobs or for busy times.
> ----- Original Message -----
> From: "Prentice Bisbal" <pbisbal at pppl.gov>
> To: slurm-users at lists.schedmd.com
> Sent: Friday, April 19, 2019 11:27:08 AM
> Subject: Re: [slurm-users] Increasing job priority based on resources requested.
> I certainly understand your point of view, but yes, this is definitely
> what I want. We only have a few large memory nodes, so we want jobs that
> request a lot of memory to have higher priority so they get assigned to
> those large memory nodes ahead of lower-memory jobs which could run
> anywhere else. But we don't want those nodes to sit idle if there's jobs
> in the queue that need that much memory. Similar idea for IB - jobs
> that need IB should get priority over jobs that don't.
> Ideally, I wouldn't have such a heterogeneous environment, and then this
> wouldn't be needed at all.
> I agree this opens another avenue for unscrupulous users to game the
> system, but that (in theory) can be policed by looking at memory
> requested vs. memory used in the accounting data to identify any abusers
> and then give them a stern talking to.
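
A minimal sketch of that requested-vs-used check, in Python. It assumes the per-job values have already been pulled out of the accounting data as strings with a K/M/G/T suffix (for example from sacct's ReqMem and MaxRSS fields); the function names, the sample records, and the 25% threshold are all hypothetical. A trailing "n" or "c" on a request is simply dropped, so per-CPU requests would still need to be multiplied by the CPU count, which this sketch does not do.

```python
import re
from collections import defaultdict

# Multipliers to convert each unit suffix to megabytes.
_UNITS_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0, "T": 1024.0 * 1024}


def to_mb(value):
    """Convert strings such as '123456K', '4000M', '16G' or '4000Mc' to MB."""
    match = re.fullmatch(r"([\d.]+)([KMGT])[nc]?", value.strip())
    if match is None:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS_TO_MB[unit]


def flag_overrequesters(records, min_ratio=0.25):
    """Return users whose total used/requested memory falls below min_ratio.

    records is an iterable of (user, requested, used) string tuples.
    """
    requested = defaultdict(float)
    used = defaultdict(float)
    for user, req, use in records:
        requested[user] += to_mb(req)
        used[user] += to_mb(use)
    return sorted(
        user for user in requested
        if requested[user] > 0 and used[user] / requested[user] < min_ratio
    )


# Made-up sample data: (user, memory requested, peak memory used).
sample = [
    ("alice", "256G", "8G"),
    ("bob", "16G", "12G"),
    ("alice", "128G", "4000M"),
]
suspects = flag_overrequesters(sample)
```

With the sample data above, only "alice" is flagged; where to set the threshold is a policy choice.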
> On 4/18/19 5:27 PM, Ryan Novosielski wrote:
>> This is not an official answer really, but I’ve always just considered this to be the way that the scheduler works. It wants to get work completed, so it will have a bias toward doing what is possible vs. not (can’t use 239GB of RAM on a 128GB node). And really, is a higher priority what you want? I’m not so sure. How soon will someone figure out that they might get a higher priority based on requesting some feature they don’t need?
>> || \\UTGERS, |---------------------------*O*---------------------------
>> ||_// the State | Ryan Novosielski - novosirj at rutgers.edu
>> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
>> || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
>>> On Apr 18, 2019, at 5:20 PM, Prentice Bisbal <pbisbal at pppl.gov> wrote:
>>> Is there a way to increase a job's priority based on the resources or constraints it has requested?
>>> For example, we have a very heterogeneous cluster here: Some nodes only have 1 Gb Ethernet, some have 10 Gb Ethernet, and others have DDR IB. In addition, we have some large memory nodes with RAM amounts ranging from 128 GB up to 512 GB. To allow a user to request IB, I have implemented that as a feature in the node definition so users can request that as a constraint.
>>> I would like to make it so that if a job requests IB, its priority will go up, or if it requests a lot of memory (specifically memory-per-cpu), its priority will go up in proportion to the amount of memory requested. Is this possible? If so, how?
>>> I have tried going through the documentation and googling, but 'priority' is used to discuss job priority in general so much that I couldn't find any search results relevant to this.