[slurm-users] preemptable queue

Paul Edmon pedmon at cfa.harvard.edu
Fri Jan 12 15:47:11 UTC 2024


My concern was that your config inadvertently had that line commented 
out and that you were then seeing problems. If it wasn't, then no 
worries at this point.

We run using preempt/partition_prio on our cluster and have a mix of 
partitions using PreemptMode=OFF and PreemptMode=REQUEUE. So I know that 
combination works. I would be surprised if PreemptMode=CANCEL did not 
work as that's a valid option.

Something we do have set though is what the default mode is. We have set:

### Governs default preemption behavior
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

So you might try setting that default to PreemptMode=CANCEL in the main 
section of slurm.conf and then setting a specific PreemptMode for each 
of your partitions. That's what we do and it works for us.
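
As an untested sketch, assuming the partition definitions from the 
quoted message below are otherwise unchanged, that could look like the 
following. The idea is that the cluster-wide PreemptMode is no longer 
left at its default of OFF, which the preempt.html excerpt quoted below 
describes as incompatible with preempt/partition_prio, while the 
partitions that should never be preempted still say so explicitly:

```conf
# Cluster-wide defaults (main section of slurm.conf).
# Assumption: CANCEL is the desired default; we use REQUEUE here ourselves.
PreemptType=preempt/partition_prio
PreemptMode=CANCEL

# Per-partition overrides, copied from the quoted message below.
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
```

Once the daemons have picked up the change, "scontrol show config | grep 
-i preempt" should show which PreemptType and PreemptMode were actually 
loaded.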

-Paul Edmon-

On 1/12/2024 10:33 AM, Davide DelVento wrote:
> Thanks Paul,
>
> I don't understand what you mean by having a typo somewhere. I mean, 
> that configuration works just fine right now, whereas if I add the 
> commented out line any slurm command will just abort with the error 
> "PreemptType and PreemptMode values incompatible". So, assuming there 
> is a typo, it should be in the commented line right? Or are you saying 
> that having that line makes slurm sensitive to a typo somewhere else 
> that would be otherwise ignored? Obviously I can't exclude that 
> option, but it seems unlikely to me. Also because it does say these 
> two things are incompatible.
>
> It would obviously be much better if the error said what EXACTLY is 
> incompatible with what. In the documentation at 
> https://slurm.schedmd.com/preempt.html I see many clues of what that 
> could be, and hence I am asking people here who may have deployed 
> preemption already on their system. Some excerpts from that URL:
>
>
> *PreemptType*: Specifies the plugin used to identify which jobs can be 
> preempted in order to start a pending job.
>
>   * /preempt/none/: Job preemption is disabled (default).
>   * /preempt/partition_prio/: Job preemption is based upon partition
>     /PriorityTier/. Jobs in higher PriorityTier partitions may preempt
>     jobs from lower PriorityTier partitions. This is not compatible
>     with /PreemptMode=OFF/.
>
>
> which somewhat makes it sound like all partitions should have 
> preemption set, and not only some. I obviously have some "off" 
> partitions. However, elsewhere in that document it says
>
> *PreemptMode*: Mechanism used to preempt jobs or enable gang 
> scheduling. When the /PreemptType/ parameter is set to enable 
> preemption, the /PreemptMode/ in the main section of slurm.conf 
> selects the default mechanism used to preempt the preemptable jobs for 
> the cluster.
> /PreemptMode/ may be specified on a per partition basis to override 
> this default value if /PreemptType=preempt/partition_prio/.
>
> which kind of sounds like it should be okay (unless it means 
> **everything** must be different than OFF). Yet still elsewhere in 
> that same page it says
>
> On the other hand, if you want to use 
> /PreemptType=preempt/partition_prio/ to allow jobs from higher 
> PriorityTier partitions to Suspend jobs from lower PriorityTier 
> partitions, then you will need overlapping partitions, and 
> /PreemptMode=SUSPEND,GANG/ to use Gang scheduler to resume the 
> suspended job(s). In either case, time-slicing won't happen between 
> jobs on different partitions.
>
> Which somewhat sounds like only suspend and gang can be used as 
> preemption modes, and not cancel (my preference) or requeue (perhaps 
> acceptable, if I jump through some hoops).
>
> So to me the documentation is highly confusing about what can or 
> cannot be used together with what else, and the examples at the bottom 
> of the page are nice, but they do not specify the full settings. 
> Particularly this one https://slurm.schedmd.com/preempt.html#example2 
> is close enough to mine, but it does not tell what PreemptType has 
> been chosen (nor if "cancel" would be allowed or not in that setup).
>
> Thanks again!
>
> On Fri, Jan 12, 2024 at 7:22 AM Paul Edmon <pedmon at cfa.harvard.edu> wrote:
>
>     At least in the example you are showing you have PreemptType
>     commented out, which means it will return the default. PreemptMode
>     Cancel should work, I don't see anything in the documentation that
>     indicates it wouldn't. So I suspect you have a typo somewhere in
>     your conf.
>
>     -Paul Edmon-
>
>     On 1/11/2024 6:01 PM, Davide DelVento wrote:
>>     I would like to add a preemptable queue to our cluster. Actually
>>     I already have. We simply want jobs submitted to that queue to be
>>     preempted if there are no resources available for jobs in other
>>     (high priority) queues. Conceptually very simple, no
>>     conditionals, no choices, just what I wrote.
>>     However it does not work as desired.
>>
>>     This is the relevant part:
>>
>>     grep -i Preemp /opt/slurm/slurm.conf
>>     #PreemptType = preempt/partition_prio
>>     PartitionName=regular DefMemPerCPU=4580 Default=True
>>     Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
>>     PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP
>>     PreemptMode=off PriorityTier=500
>>     PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36]
>>     State=UP PreemptMode=cancel PriorityTier=100
>>
>>
>>     That PreemptType setting (now commented) fully breaks slurm,
>>     everything refuses to run with errors like
>>
>>     $ squeue
>>     squeue: error: PreemptType and PreemptMode values incompatible
>>     squeue: fatal: Unable to process configuration file
>>
>>     If I understand correctly the documentation at
>>     https://slurm.schedmd.com/preempt.html that is because preemption
>>     cannot cancel jobs based on partition priority, which (if true)
>>     is really unfortunate. I understand that allowing
>>     cross-partition time-slicing could be tricky and so I understand
>>     why that isn't allowed, but cancelling? Anyway, I have three questions:
>>
>>     1) is that correct and so should I avoid using either partition
>>     priority or cancelling?
>>     2) is there an easy way to trick slurm into requeuing and then
>>     have those jobs cancelled instead?
>>     3) I guess the cleanest option would be to implement QoS, but
>>     I've never done it and we don't really need it for anything else
>>     other than this. The documentation looks complicated, but is it?
>>     The great Ole's website is unavailable at the moment...
>>
>>     Thanks!!
>