[slurm-users] How to assign temporary priority bonuses or penalties?

Luke Yeager lyeager at nvidia.com
Thu Dec 10 17:20:33 UTC 2020


(originally posted at https://bugs.schedmd.com/show_bug.cgi?id=10322)

There are some great tools for assigning discounts or penalties to jobs before they are allocated resources (QOS.UsageFactor, Partition.TRESBillingWeights, etc.).

But what if I want to change the cost of a job after the fact? I might want to avoid penalizing users who spent their allocated resources on jobs that failed for reasons outside their control (hardware failure, parallel FS glitch, etc.). Or I might want to charge extra for jobs that require node reboots to clean up afterwards. Either way, I want to be able to adjust how the job affects the user's current fairshare priority for queued jobs.

Are there any existing solutions for this?

The only solutions I've found so far are:

  1.  'sacctmgr modify ... set RawUsage=0' - obviously this is too big of a hammer. I only want to edit a single job, and I might want to *increase* the usage for the job - not decrease it.
  2.  For clusters using "banking" (limits on TRESMins and PriorityDecayHalfLife=0), you can essentially accomplish this by editing the limit after the fact (increasing the limit for a refund, decreasing it for a penalty). See https://github.com/jcftang/slurm-bank/blob/master/src/sbank-refund, for example. But we don't use that accounting strategy at our site. And that seems a little sketchy anyway since you'd need to remember to reset the limits back to their intended values at each usage reset.
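
To make option 2 concrete, here is a rough sketch of the arithmetic behind a refund or penalty under the "banking" setup. It is an illustration only, not how sbank-refund itself is implemented. The helper names (adjusted_limit, sacctmgr_command), the account and cluster names, and the choice to charge in plain CPU-minutes are all assumptions of mine; the sacctmgr invocation is only built as an argument list, not executed, and should be checked against your site's sacctmgr before use.

```python
from typing import List


def adjusted_limit(current_limit_mins: int, job_cpus: int,
                   job_elapsed_mins: int, refund: bool = True) -> int:
    """Return a new CPU-minute limit after refunding or penalizing one job.

    Hypothetical helper: a refund raises the limit by the job's CPU-minutes,
    a penalty lowers it by the same amount (never below zero).
    """
    charge = job_cpus * job_elapsed_mins
    delta = charge if refund else -charge
    return max(0, current_limit_mins + delta)


def sacctmgr_command(account: str, cluster: str, new_limit_mins: int) -> List[str]:
    """Build (but do not run) a sacctmgr call that sets the new limit.

    Assumes the limit is kept as GrpTRESMins=cpu=<minutes> on the account.
    """
    return [
        "sacctmgr", "-i", "modify", "account",
        "where", f"name={account}", f"cluster={cluster}",
        "set", f"GrpTRESMins=cpu={new_limit_mins}",
    ]


if __name__ == "__main__":
    # Example: an 8-CPU job that ran for 60 minutes and then hit a bad node.
    new_limit = adjusted_limit(10_000, job_cpus=8, job_elapsed_mins=60)
    print(" ".join(sacctmgr_command("example_acct", "example_cluster", new_limit)))
```

This also shows the drawback mentioned above: the adjusted limit no longer matches the intended allocation, so it has to be restored by hand at the next usage reset.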

The official answer I got on the bug is "I don't think what you are looking for is possible with Slurm at the moment." I'm posting here in the hope that someone else has a creative solution. How do y'all handle this?

Thanks!
Luke

Search keywords: priority bump refund penalty accounting