<div dir="ltr">I could have sworn I had tested this before implementing it and that it worked as expected.<div><br></div><div>If I only imagined that testing, is there a way to allow preemption across partitions?</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Aug 20, 2021 at 8:40 AM Brian Andrus &lt;<a href="mailto:toomuchit@gmail.com">toomuchit@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <p>IIRC, preemption is determined by partition first, not by node.</p>
    <p>Since your pending job is in the 'day' partition, it will not
      preempt something in the 'night' partition (even if the node is in
      both).</p>
    <p>Brian Andrus<br>
    </p>
    <div>On 8/19/2021 2:49 PM, Russell Jones
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">Hi all,
        <div><br>
          I could use some help understanding why preemption is not
          working properly for me. A running job is blocking other jobs
          in a way that doesn't make sense to me. Any assistance is
          appreciated, thank you!</div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>I have two partitions defined in Slurm, a daytime and a
          nighttime partition:<br>
          <br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>Day partition - PriorityTier of 5, always Up.
            Limited resources under this QOS.</div>
          <div>Night partition - PriorityTier of 5 at night; during the
            day it is set to Down and its PriorityTier is changed to 1.
            Jobs can be submitted to the night queue under an unlimited
            QOS as long as resources are available. <br>
            <br>
            The idea is that jobs can keep running in the night
            partition, even during the day, until resources are
            requested from the day partition. Jobs in the night
            partition would then be requeued or canceled to satisfy
            those requests.</div>
        </blockquote>
        <div><br>
          <br>
          Current output of "scontrol show part":<br>
          <br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>PartitionName=day</div>
          <div>   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL</div>
          <div>   AllocNodes=ALL Default=NO QoS=part_day</div>
          <div>   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO
            GraceTime=0 Hidden=NO</div>
          <div>   MaxNodes=UNLIMITED MaxTime=1-00:00:00 MinNodes=0
            LLN=NO MaxCPUsPerNode=UNLIMITED</div>
          <div>   Nodes=cluster-r1n[01-13],cluster-r2n[01-08]</div>
          <div>   PriorityJobFactor=1 PriorityTier=5 RootOnly=NO
            ReqResv=NO OverSubscribe=NO</div>
          <div>   OverTimeLimit=NONE PreemptMode=REQUEUE</div>
          <div>   State=UP TotalCPUs=336 TotalNodes=21
            SelectTypeParameters=NONE</div>
          <div>   JobDefaults=(null)</div>
          <div>   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED</div>
        </blockquote>
        <div><br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>PartitionName=night</div>
          <div>   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL</div>
          <div>   AllocNodes=ALL Default=NO QoS=part_night</div>
          <div>   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO
            GraceTime=0 Hidden=NO</div>
          <div>   MaxNodes=22 MaxTime=7-00:00:00 MinNodes=0 LLN=NO
            MaxCPUsPerNode=UNLIMITED</div>
          <div>   Nodes=cluster-r1n[01-13],cluster-r2n[01-08]</div>
          <div>   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO
            ReqResv=NO OverSubscribe=NO</div>
          <div>   OverTimeLimit=NONE PreemptMode=REQUEUE</div>
          <div>   State=DOWN TotalCPUs=336 TotalNodes=21
            SelectTypeParameters=NONE</div>
          <div>   JobDefaults=(null)</div>
          <div>   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED</div>
        </blockquote>
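        <div><br>
          To compare the two records without reading them side by side,
          here is a minimal Python sketch that parses key=value text of
          the kind "scontrol show part" prints and compares the
          PriorityTier values. The names SCONTROL_OUTPUT,
          parse_partitions and tier are hypothetical helpers made up for
          this illustration, and the embedded text is an abbreviated
          copy of the output above. It only reads the fields; it does
          not try to predict what the scheduler will decide.</div>
        <pre>
```python
import re

# Abbreviated copy of the two partition records quoted above.
SCONTROL_OUTPUT = """\
PartitionName=day
   PriorityJobFactor=1 PriorityTier=5 RootOnly=NO
   OverTimeLimit=NONE PreemptMode=REQUEUE
   State=UP TotalCPUs=336 TotalNodes=21

PartitionName=night
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO
   OverTimeLimit=NONE PreemptMode=REQUEUE
   State=DOWN TotalCPUs=336 TotalNodes=21
"""


def parse_partitions(text):
    """Return {partition name: {key: value}} from 'scontrol show part' text.

    Assumes whitespace-separated key=value tokens whose values contain
    no further '=' sign, which holds for the fields used here.
    """
    partitions = {}
    current = None
    for key, value in re.findall(r"(\S+?)=(\S*)", text):
        if key == "PartitionName":
            # A new record starts; later tokens belong to it.
            current = partitions.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return partitions


def tier(partitions, name):
    """PriorityTier of the named partition as an integer."""
    return int(partitions[name]["PriorityTier"])


parts = parse_partitions(SCONTROL_OUTPUT)
for name, fields in parts.items():
    print(name, fields["State"], fields["PriorityTier"], fields["PreemptMode"])
print("day outranks night:", tier(parts, "day") > tier(parts, "night"))
```
        </pre>
        <div>On a live system the same function could be fed the
          captured output of "scontrol show part" instead of the
          embedded string.</div>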
        <div><br>
          <br>
          <br>
          I currently have a job in the night partition that is blocking
          jobs in the day partition, even though the day partition has a
          PriorityTier of 5, and night partition is Down with a
          PriorityTier of 1.<br>
          <br>
          My current slurm.conf preemption settings are:<br>
          <br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>PreemptMode=REQUEUE</div>
          <div>PreemptType=preempt/partition_prio</div>
        </blockquote>
        <div><br>
          <br>
          The blocking job's scontrol show job output is:<br>
          <br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>JobId=105713 JobName=jobname</div>
          <div>   Priority=1986 Nice=0 Account=xxx QOS=normal</div>
          <div>   JobState=RUNNING Reason=None Dependency=(null)</div>
          <div>   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0</div>
          <div>   RunTime=17:49:39 TimeLimit=7-00:00:00 TimeMin=N/A</div>
          <div>   SubmitTime=2021-08-18T22:36:36
            EligibleTime=2021-08-18T22:36:36</div>
          <div>   AccrueTime=2021-08-18T22:36:36</div>
          <div>   StartTime=2021-08-18T22:36:39
            EndTime=2021-08-25T22:36:39 Deadline=N/A</div>
          <div>   PreemptEligibleTime=2021-08-18T22:36:39
            PreemptTime=None</div>
          <div>   SuspendTime=None SecsPreSuspend=0
            LastSchedEval=2021-08-18T22:36:39</div>
          <div>   Partition=night AllocNode:Sid=cluster-1:1341505</div>
          <div>   ReqNodeList=(null) ExcNodeList=(null)</div>
          <div>   NodeList=cluster-r1n[12-13],cluster-r2n[04-06]</div>
          <div>   BatchHost=cluster-r1n12</div>
          <div>   NumNodes=5 NumCPUs=80 NumTasks=5 CPUs/Task=1
            ReqB:S:C:T=0:0:*:*</div>
          <div>   TRES=cpu=80,node=5,billing=80,gres/gpu=20</div>
          <div>   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*</div>
          <div>   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0</div>
          <div>   Features=(null) DelayBoot=00:00:00</div>
          <div>   OverSubscribe=NO Contiguous=0 Licenses=(null)
            Network=(null)</div>
        </blockquote>
        <div><br>
          <br>
          The job that is being blocked:<br>
          <br>
        </div>
        <blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
          <div>JobId=105876 JobName=bash</div>
          <div>   Priority=2103 Nice=0 Account=xxx QOS=normal</div>
          <div>   JobState=PENDING
Reason=Nodes_required_for_job_are_DOWN,_DRAINED_or_reserved_for_jobs_in_higher_priority_partitions
            Dependency=(null)</div>
          <div>   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0</div>
          <div>   RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A</div>
          <div>   SubmitTime=2021-08-19T16:19:23
            EligibleTime=2021-08-19T16:19:23</div>
          <div>   AccrueTime=2021-08-19T16:19:23</div>
          <div>   StartTime=Unknown EndTime=Unknown Deadline=N/A</div>
          <div>   SuspendTime=None SecsPreSuspend=0
            LastSchedEval=2021-08-19T16:26:43</div>
          <div>   Partition=day AllocNode:Sid=cluster-1:2776451</div>
          <div>   ReqNodeList=(null) ExcNodeList=(null)</div>
          <div>   NodeList=(null)</div>
          <div>   NumNodes=3 NumCPUs=40 NumTasks=40 CPUs/Task=1
            ReqB:S:C:T=0:0:*:*</div>
          <div>   TRES=cpu=40,node=1,billing=40</div>
          <div>   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*</div>
          <div>   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0</div>
          <div>   Features=(null) DelayBoot=00:00:00</div>
          <div>   OverSubscribe=NO Contiguous=0 Licenses=(null)
            Network=(null)</div>
          <div><br>
          </div>
        </blockquote>
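        <div><br>
          To confirm that the running night job sits on nodes the day
          partition also covers, here is a minimal Python sketch that
          expands the bracketed node lists shown above. expand_hostlist
          is a hypothetical helper written for this illustration, not a
          Slurm API, and it handles only a single zero-padded range per
          bracket, which is all the quoted output uses.</div>
        <pre>
```python
import re


def expand_hostlist(hostlist):
    """Expand e.g. 'n[01-03],m[7-8]' into a set of host names.

    Simplified on purpose: each comma-separated entry must be a prefix
    followed by exactly one 'lo-hi' range in square brackets.
    """
    hosts = set()
    for prefix, lo, hi in re.findall(r"([^,\[]+)\[(\d+)-(\d+)\]", hostlist):
        width = len(lo)  # keep the zero padding used in the input
        for i in range(int(lo), int(hi) + 1):
            hosts.add(f"{prefix}{i:0{width}d}")
    return hosts


# Nodes= of the day partition and NodeList= of job 105713, as quoted above.
day_nodes = expand_hostlist("cluster-r1n[01-13],cluster-r2n[01-08]")
night_job_nodes = expand_hostlist("cluster-r1n[12-13],cluster-r2n[04-06]")

print(len(day_nodes))
print(sorted(night_job_nodes))
print(night_job_nodes.issubset(day_nodes))
```
        </pre>
        <div>The counts line up with TotalNodes=21 and NumNodes=5 in the
          output above, and every node of the night job is also a day
          partition node.</div>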
        <br>
        <br>
        Why is the day job not preempting the night job? <br>
      </div>
    </blockquote>
  </div>

</blockquote></div>