[slurm-users] preemptable queue
Paul Edmon
pedmon at cfa.harvard.edu
Fri Jan 12 15:47:11 UTC 2024
My concern was your config inadvertently having that line commented out
and then seeing problems. If it wasn't, then no worries at this point.
We run using preempt/partition_prio on our cluster and have a mix of
partitions using PreemptMode=OFF and PreemptMode=REQUEUE. So I know that
combination works. I would be surprised if PreemptMode=CANCEL did not
work as that's a valid option.
Something we do have set though is what the default mode is. We have set:
### Governs default preemption behavior
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
So you might try setting that default of PreemptMode=CANCEL and then set
specific PreemptModes for all your partitions. That's what we do and it
works for us.
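As a rough, untested sketch of that suggestion applied to the partition lines from your earlier message: the partition names, node ranges, and PriorityTier values below are copied from your slurm.conf excerpt, and the only assumed changes are uncommenting PreemptType and adding a global PreemptMode that is not OFF.

```
# Cluster-wide defaults. Assumption: the global PreemptMode must be
# something other than OFF once PreemptType=preempt/partition_prio is set.
PreemptType=preempt/partition_prio
PreemptMode=CANCEL

# Per-partition overrides. Jobs in "regular" and "All" are never preempted;
# jobs in "lowpriority" are cancelled when higher PriorityTier work needs the nodes.
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=OFF PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=OFF PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=CANCEL PriorityTier=100
```

Once the controller has picked up the change, "scontrol show config" lists the PreemptType and PreemptMode actually in effect, which is a quick way to confirm the defaults were read as intended.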
-Paul Edmon-
On 1/12/2024 10:33 AM, Davide DelVento wrote:
> Thanks Paul,
>
> I don't understand what you mean by having a typo somewhere. I mean,
> that configuration works just fine right now, whereas if I add the
> commented out line any slurm command will just abort with the error
> "PreemptType and PreemptMode values incompatible". So, assuming there
> is a typo, it should be in the commented line right? Or are you saying
> that having that line makes slurm sensitive to a typo somewhere else
> that would be otherwise ignored? Obviously I can't exclude that
> option, but it seems unlikely to me. Also because it does say these
> two things are incompatible.
>
> It would obviously be much better if the error said what EXACTLY is
> incompatible with what, but in the documentation at
> https://slurm.schedmd.com/preempt.html I see many clues of what that
> could be, and hence I am asking people here who may have deployed
> preemption already on their system. Some excerpts from that URL:
>
>
> *PreemptType*: Specifies the plugin used to identify which jobs can be
> preempted in order to start a pending job.
>
> * /preempt/none/: Job preemption is disabled (default).
> * /preempt/partition_prio/: Job preemption is based upon partition
> /PriorityTier/. Jobs in higher PriorityTier partitions may preempt
> jobs from lower PriorityTier partitions. This is not compatible
> with /PreemptMode=OFF/.
>
>
> which somewhat makes it sound like all partitions should have
> preemption set and not only some? I obviously have some "off"
> partitions. However elsewhere in that document it says
>
> *PreemptMode*: Mechanism used to preempt jobs or enable gang
> scheduling. When the /PreemptType/ parameter is set to enable
> preemption, the /PreemptMode/ in the main section of slurm.conf
> selects the default mechanism used to preempt the preemptable jobs for
> the cluster.
> /PreemptMode/ may be specified on a per partition basis to override
> this default value if /PreemptType=preempt/partition_prio/.
>
> which kind of sounds like it should be okay (unless it means
> **everything** must be different than OFF). Yet still elsewhere in
> that same page it says
>
> On the other hand, if you want to use
> /PreemptType=preempt/partition_prio/ to allow jobs from higher
> PriorityTier partitions to Suspend jobs from lower PriorityTier
> partitions, then you will need overlapping partitions, and
> /PreemptMode=SUSPEND,GANG/ to use Gang scheduler to resume the
> suspended job(s). In either case, time-slicing won't happen between
> jobs on different partitions.
>
> Which somewhat sounds like only suspend and gang can be used as
> preemption modes, and not cancel (my preference) or requeue (perhaps
> acceptable, if I jump through some hoops).
>
> So to me the documentation is highly confusing about what can or
> cannot be used together with what else, and the examples at the bottom
> of the page are nice, but they do not specify the full settings.
> Particularly this one https://slurm.schedmd.com/preempt.html#example2
> is close enough to mine, but it does not tell what PreemptType has
> been chosen (nor if "cancel" would be allowed or not in that setup).
>
> Thanks again!
>
> On Fri, Jan 12, 2024 at 7:22 AM Paul Edmon <pedmon at cfa.harvard.edu> wrote:
>
> At least in the example you are showing you have PreemptType
> commented out, which means it will return the default. PreemptMode
> Cancel should work, I don't see anything in the documentation that
> indicates it wouldn't. So I suspect you have a typo somewhere in
> your conf.
>
> -Paul Edmon-
>
> On 1/11/2024 6:01 PM, Davide DelVento wrote:
>> I would like to add a preemptable queue to our cluster. Actually
>> I already have. We simply want jobs submitted to that queue to be
>> preempted if there are no resources available for jobs in other
>> (high priority) queues. Conceptually very simple, no
>> conditionals, no choices, just what I wrote.
>> However it does not work as desired.
>>
>> This is the relevant part:
>>
>> grep -i Preemp /opt/slurm/slurm.conf
>> #PreemptType = preempt/partition_prio
>> PartitionName=regular DefMemPerCPU=4580 Default=True
>> Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
>> PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP
>> PreemptMode=off PriorityTier=500
>> PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36]
>> State=UP PreemptMode=cancel PriorityTier=100
>>
>>
>> That PreemptType setting (now commented) fully breaks slurm,
>> everything refuses to run with errors like
>>
>> $ squeue
>> squeue: error: PreemptType and PreemptMode values incompatible
>> squeue: fatal: Unable to process configuration file
>>
>> If I understand correctly the documentation at
>> https://slurm.schedmd.com/preempt.html that is because preemption
>> cannot cancel jobs based on partition priority, which (if true)
>> is really unfortunate. I understand that allowing
>> cross-partition time-slicing could be tricky and so I understand
>> why that isn't allowed, but cancelling? Anyway, I have a few questions:
>>
>> 1) is that correct and so should I avoid using either partition
>> priority or cancelling?
>> 2) is there an easy way to trick slurm into requeuing and then
>> have those jobs cancelled instead?
>> 3) I guess the cleanest option would be to implement QoS, but
>> I've never done it and we don't really need it for anything else
>> other than this. The documentation looks complicated, but is it?
>> The great Ole's website is unavailable at the moment...
>>
>> Thanks!!
>