<div dir="ltr">Thanks, Paul, for taking the time to look into this further. You are indeed correct: adding a default mode (which each partition setting then overrides) keeps slurm happy with that configuration. Moreover, after restarting the daemons, etc., per the documentation, everything seems to be working as I intended. I obviously need to do a few more tests, especially for edge cases, but adding that default seems to have completely fixed the problem.<div><br></div><div>Thanks again and have a great weekend!<br><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 12, 2024 at 8:49 AM Paul Edmon &lt;<a href="mailto:pedmon@cfa.harvard.edu">pedmon@cfa.harvard.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p>My concern was that your config might inadvertently have had that line
commented out while you were seeing the problems. If it didn't, then no
worries at this point.</p>
<p>We run using preempt/partition_prio on our cluster and have a mix
of partitions using PreemptMode=OFF and PreemptMode=REQUEUE. So I
know that combination works. I would be surprised if
PreemptMode=CANCEL did not work as that's a valid option.</p>
<p>Something we do have set though is what the default mode is. We
have set:</p>
<p>### Governs default preemption behavior<br>
PreemptType=preempt/partition_prio<br>
PreemptMode=REQUEUE</p>
<p>So you might try setting that default of PreemptMode=CANCEL and
then set specific PreemptModes for all your partitions. That's
what we do and it works for us.</p>
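<p>As a rough sketch only (this is not our actual config), applied to the
partition lines from the original post it might look something like the
following. The global default of CANCEL is an assumption based on the
stated preference; the partition lines are copied from the original post
unchanged, each with its own PreemptMode override:</p>
<pre>
# Global defaults in the main section of slurm.conf.
# PreemptMode=CANCEL here is an assumed choice of default; the
# documentation says preempt/partition_prio is not compatible with
# PreemptMode=OFF, which is what an unset global PreemptMode amounts to.
PreemptType=preempt/partition_prio
PreemptMode=CANCEL

# Per-partition settings override the global default.
PartitionName=regular DefMemPerCPU=4580 Default=True Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36] State=UP PreemptMode=cancel PriorityTier=100
</pre>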
<p>-Paul Edmon-<br>
</p>
<div>On 1/12/2024 10:33 AM, Davide DelVento
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Thanks Paul,
<div><br>
</div>
<div>I don't understand what you mean by having a typo
somewhere. I mean, that configuration works just fine right
now, whereas if I add the commented out line any slurm command
will just abort with the error "PreemptType and PreemptMode
values incompatible". So, assuming there is a typo, it should
be in the commented line right? Or are you saying that having
that line makes slurm sensitive to a typo somewhere else that
would be otherwise ignored? Obviously I can't exclude that
option, but it seems unlikely to me, especially because the error does say
these two things are incompatible. </div>
<div><br>
</div>
<div>It would obviously be much better if the error said what
EXACTLY is incompatible with what. In the documentation at <a href="https://slurm.schedmd.com/preempt.html" target="_blank">https://slurm.schedmd.com/preempt.html</a>
I see many clues as to what that could be, and hence I am asking
people here who may have already deployed preemption on their
system. Some excerpts from that URL:</div>
<div><br>
</div>
<div><br>
</div>
<div><b style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptType</b><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px">:
Specifies the plugin used to identify which jobs can be
preempted in order to start a pending job.</span>
<ul style="box-sizing:border-box;margin:0px 0px 0px 1.5em;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:1.5em;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;list-style-position:initial;color:rgb(70,84,92)">
<li style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline"><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline">preempt/none</i>:
Job preemption is disabled (default).</li>
<li style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline"><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline">preempt/partition_prio</i>:
Job preemption is based upon partition <i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline">PriorityTier</i>.
Jobs in higher PriorityTier partitions may preempt jobs
from lower PriorityTier partitions. This is not compatible
with <i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline">PreemptMode=OFF</i>.</li>
</ul>
<div><font face="Source Sans Pro, Helvetica, Arial, sans-serif" color="#46545c"><span style="font-size:20px"><br>
</span></font></div>
</div>
which somewhat makes it sound like all partitions should have
preemption set and not only some? I obviously have some "off"
partitions. However, elsewhere in that document it says
<div><font face="Source Sans Pro, Helvetica, Arial, sans-serif" color="#46545c"><span style="font-size:20px"><br>
</span></font></div>
<div><b style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptMode</b><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px">:
Mechanism used to preempt jobs or enable gang scheduling.
When the </span><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptType</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"> parameter
is set to enable preemption, the </span><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptMode</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"> in
the main section of slurm.conf selects the default mechanism
used to preempt the preemptable jobs for the cluster.</span><br style="box-sizing:border-box;color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px">
<i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptMode</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"> may
be specified on a per partition basis to override this
default value if </span><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptType=preempt/partition_prio</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px">.</span><br>
</div>
<div><br>
</div>
<div>which kind of sounds like it should be okay (unless it
means **everything** must be different than OFF). Yet still
elsewhere in that same page it says</div>
<div><br>
</div>
<div><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px">On
the other hand, if you want to use </span><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptType=preempt/partition_prio</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"> to
allow jobs from higher PriorityTier partitions to Suspend
jobs from lower PriorityTier partitions, then you will need
overlapping partitions, and </span><i style="box-sizing:border-box;margin:0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-variant-alternates:inherit;font-stretch:inherit;font-size:20px;line-height:inherit;font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-kerning:inherit;font-feature-settings:inherit;vertical-align:baseline;color:rgb(70,84,92)">PreemptMode=SUSPEND,GANG</i><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"> to
use Gang scheduler to resume the suspended job(s). In either
case, time-slicing won't happen between jobs on different
partitions.</span><br>
</div>
<div><span style="color:rgb(70,84,92);font-family:"Source Sans Pro",Helvetica,Arial,sans-serif;font-size:20px"><br>
</span></div>
Which somewhat sounds like only suspend and gang can be used as
preemption modes, and not cancel (my preference) or requeue
(perhaps acceptable, if I jump through some hoops).
<div><br>
</div>
<div>So to me the documentation is highly confusing about what
can or cannot be used together with what else, and the
examples at the bottom of the page are nice, but they do not
specify the full settings. Particularly this one <a href="https://slurm.schedmd.com/preempt.html#example2" target="_blank">https://slurm.schedmd.com/preempt.html#example2</a>
is close enough to mine, but it does not say which PreemptType
has been chosen (nor whether "cancel" would be allowed in
that setup).</div>
<div><br>
</div>
<div>Thanks again!</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Jan 12, 2024 at
7:22 AM Paul Edmon &lt;<a href="mailto:pedmon@cfa.harvard.edu" target="_blank">pedmon@cfa.harvard.edu</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>At least in the example you are showing you have
PreemptType commented out, which means it will return the
default. PreemptMode Cancel should work, I don't see
anything in the documentation that indicates it wouldn't.
So I suspect you have a typo somewhere in your conf.</p>
<p>-Paul Edmon-<br>
</p>
<div>On 1/11/2024 6:01 PM, Davide DelVento wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I would like to add a preemptable queue to
our cluster. Actually I already have. We simply want
jobs submitted to that queue to be preempted if there are
no resources available for jobs in other (high priority)
queues. Conceptually very simple, no conditionals, no
choices, just what I wrote.
<div>However it does not work as desired.
<div><br>
This is the relevant part:</div>
<div><br>
</div>
<div><span style="font-size:9pt;font-family:Arial,sans-serif;color:rgb(51,51,51)">grep
-i Preemp /opt/slurm/slurm.conf <br>
#PreemptType = preempt/partition_prio <br>
PartitionName=regular DefMemPerCPU=4580
Default=True Nodes=node[01-12] State=UP
PreemptMode=off PriorityTier=200 <br>
PartitionName=All DefMemPerCPU=4580
Nodes=node[01-36] State=UP PreemptMode=off
PriorityTier=500 <br>
PartitionName=lowpriority DefMemPerCPU=4580
Nodes=node[01-36] State=UP PreemptMode=cancel
PriorityTier=100 <br>
<br>
<br>
</span>That PreemptType setting (now commented out)
fully breaks slurm: everything refuses to run, with
errors like<span style="font-size:9pt;font-family:Arial,sans-serif;color:rgb(51,51,51)"><br>
<br>
$ squeue <br>
squeue: error: PreemptType and PreemptMode values
incompatible <br>
squeue: fatal: Unable to process configuration
file <br>
<br>
</span></div>
If I understand the documentation at <a href="https://slurm.schedmd.com/preempt.html" target="_blank">https://slurm.schedmd.com/preempt.html</a>
correctly, that is because preemption cannot cancel jobs based on
partition priority, which (if true) is really
unfortunate. I understand that allowing
cross-partition time-slicing could be tricky and so I
understand why that isn't allowed, but cancelling?
Anyway, I have three questions:
<div><br>
</div>
<div>1) is that correct and so should I avoid using
either partition priority or cancelling?</div>
<div>2) is there an easy way to trick slurm into
requeuing and then have those jobs cancelled instead?</div>
<div>3) I guess the cleanest option would be to
implement QoS, but I've never done it and we don't
really need it for anything else other than this.
The documentation looks complicated, but is it? The
great Ole's website is unavailable at the moment...</div>
</div>
<div><br>
</div>
<div>Thanks!!</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>