[slurm-users] Priority access for a group of users
Prentice Bisbal
pbisbal at pppl.gov
Tue Feb 19 14:12:45 UTC 2019
I just set this up a couple of weeks ago myself. Creating two partitions
is definitely the way to go. I created one partition, "general", for
normal general-access jobs, and another, "interruptible", for
general-access jobs that can be interrupted, and then set PriorityTier
accordingly in my slurm.conf file (node names omitted for clarity/brevity).
PartitionName=general Nodes=... MaxTime=48:00:00 State=Up PriorityTier=10 QOS=general
PartitionName=interruptible Nodes=... MaxTime=48:00:00 State=Up PriorityTier=1 QOS=interruptible
I then set PreemptMode=Requeue, because I'd rather have jobs requeued
than suspended, and it's been working great. There are a few other
settings I had to change. The best documentation for all the settings
you need to change is https://slurm.schedmd.com/preempt.html
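For reference, a minimal sketch of the preemption-related slurm.conf
settings described above, assuming partition-priority preemption (the
grace period is optional and site-specific):

# slurm.conf -- cluster-wide preemption settings (sketch, not a drop-in)
PreemptType=preempt/partition_prio   # preempt based on partition PriorityTier
PreemptMode=REQUEUE                  # preempted jobs are requeued rather than suspended
# Optional: give preempted jobs time to checkpoint before they are killed,
# e.g. per partition: PartitionName=interruptible ... GraceTime=120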
Everything has been working exactly as desired and advertised. My users
who needed the ability to run low-priority, long-running jobs are very
happy.
The one caveat is that jobs that will be killed and requeued need to
support checkpoint/restart. So when this becomes a production thing,
users are going to have to acknowledge that they will only use this
partition for jobs that have some sort of checkpoint/restart capability.
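As an illustration, a rough sketch of a requeue-friendly batch script
for the interruptible partition; the solver name and checkpoint file are
hypothetical, the point is --requeue plus resuming from the last
checkpoint when a preempted run left one behind:

#!/bin/bash
#SBATCH --partition=interruptible
#SBATCH --time=48:00:00
#SBATCH --requeue            # allow Slurm to requeue this job when it is preempted

# Hypothetical application: restart from its own checkpoint if a previous
# (preempted) run left one behind, otherwise start from scratch.
if [ -f checkpoint.dat ]; then
    ./my_solver --restart checkpoint.dat
else
    ./my_solver --input input.dat
fi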
Prentice
On 2/15/19 11:56 AM, david baker wrote:
> Hi Paul, Marcus,
>
> Thank you for your replies. Using partition priority all makes sense.
> I was thinking of doing something similar with a set of nodes
> purchased by another group. That is, having a private high priority
> partition and a lower priority "scavenger" partition for the public.
> In this case scavenger jobs will get killed when preempted.
>
> In the present case, I did wonder if it would be possible to do
> something with just a single partition -- hence my question. Your
> replies have convinced me that two partitions will work -- with
> preemption leading to re-queued jobs.
>
> Best regards,
> David
>
> On Fri, Feb 15, 2019 at 3:09 PM Paul Edmon <pedmon at cfa.harvard.edu> wrote:
>
> Yup, PriorityTier is what we use to do exactly that here. That
> said, unless you turn on preemption, jobs may still pend if there is
> no space. We run with REQUEUE on, which has worked well.
>
>
> -Paul Edmon-
>
>
> On 2/15/19 7:19 AM, Marcus Wagner wrote:
>> Hi David,
>>
>> as far as I know, you can use the PriorityTier (partition
>> parameter) to achieve this. According to the manpages (if I
>> remember right) jobs from higher priority tier partitions have
>> precedence over jobs from lower priority tier partitions, without
>> taking the normal fairshare priority into consideration.
>>
>> Best
>> Marcus
>>
>> On 2/15/19 10:07 AM, David Baker wrote:
>>>
>>> Hello.
>>>
>>>
>>> We have a small set of compute nodes owned by a group. The group
>>> has agreed that the rest of the HPC community can use these
>>> nodes providing that they (the owners) can always have priority
>>> access to the nodes. The four nodes are well provisioned (1
>>> TByte memory each plus 2 GRID K2 graphics cards) and so there is
>>> no need to worry about preemption. In fact I'm happy for the
>>> nodes to be used as well as possible by all users. It's just
>>> that jobs from the owners must take priority if resources are
>>> scarce.
>>>
>>>
>>> What is the best way to achieve the above in slurm? I'm planning
>>> to place the nodes in their own partition. The node owners will
>>> have priority access to the nodes in that partition, but will
>>> have no advantage when submitting jobs to the public resources.
>>> Does anyone please have any ideas how to deal with this?
>>>
>>>
>>> Best regards,
>>>
>>> David
>>>
>>>
>>
>> --
>> Marcus Wagner, Dipl.-Inf.
>>
>> IT Center
>> Abteilung: Systeme und Betrieb
>> RWTH Aachen University
>> Seffenter Weg 23
>> 52074 Aachen
>> Tel: +49 241 80-24383
>> Fax: +49 241 80-624383
>> wagner at itc.rwth-aachen.de
>> www.itc.rwth-aachen.de
>