<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Seems like the time may have been off on the db server at the
      insert/update.</p>
    <p>You may want to dump the database, find which tables/records need
      to be updated, and try updating them. If anything goes wrong, you
      can restore from the dump.</p>
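    <p>A rough Python sketch of that dump-then-update approach, which only
      builds the commands for review and runs nothing. The database name
      <code>slurm_acct_db</code>, the table name
      <code>&lt;cluster&gt;_job_table</code>, and the columns
      <code>time_end</code>, <code>id_job</code> and
      <code>id_array_job</code> are assumptions about the accounting
      schema and should be checked against the real database first.
      <code>mycluster</code> is a placeholder, and the end time is the one
      tried with scontrol in the quoted message.</p>

```python
from datetime import datetime, timedelta, timezone


def build_fix_plan(cluster, array_job_id, end_time):
    """Return (dump_cmd, sql) for review; nothing is executed here."""
    # Assumption: slurm_acct_db is the accounting database name.
    dump_cmd = ["mysqldump", "--single-transaction", "slurm_acct_db"]
    # Assumption: job records live in <cluster>_job_table, time_end holds
    # epoch seconds, and array tasks carry the array ID in id_array_job.
    epoch = int(end_time.timestamp())
    sql = (
        f"UPDATE {cluster}_job_table SET time_end = {epoch} "
        f"WHERE (id_job = {array_job_id} OR id_array_job = {array_job_id}) "
        f"AND time_end > UNIX_TIMESTAMP();"
    )
    return dump_cmd, sql


# "mycluster" is a hypothetical cluster name; 290710 is the array job ID.
edt = timezone(timedelta(hours=-4))
end = datetime(2022, 8, 9, 8, 47, 1, tzinfo=edt)
dump_cmd, sql = build_fix_plan("mycluster", 290710, end)
print(" ".join(dump_cmd), "> slurm_acct_db.sql")
print(sql)
```

    <p>The <code>time_end &gt; UNIX_TIMESTAMP()</code> guard is there so
      that only records whose end time is still in the future would be
      touched.</p>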
    <p>Brian Andrus<br>
    </p>
    <div class="moz-cite-prefix">On 12/20/2022 11:51 AM, Reed Dier
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:A889B640-1F36-4DDE-9603-B366ACCCD1E6@focusvq.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      Just to follow up with some things I’ve tried:
      <div class=""><br class="">
      </div>
      <div class="">scancel doesn’t want to touch it:</div>
      <div class="">
        <blockquote type="cite" class="">
          <div class=""><font class="" face="Menlo"># scancel -v 290710</font></div>
          <div class=""><font class="" face="Menlo">scancel: Terminating
              job 290710</font></div>
          <div class=""><font class="" face="Menlo">scancel: error: Kill
              job error on job id 290710: Job/step already completing or
              completed</font></div>
        </blockquote>
        <div><br class="">
        </div>
        <div>scontrol does see that these are all members of the same
          array, but doesn’t want to touch it:</div>
        <div>
          <blockquote type="cite" class="">
            <div><font class="" face="Menlo"># scontrol update
                JobID=290710 EndTime=2022-08-09T08:47:01</font></div>
            <div><font class="" face="Menlo">290710_4,6,26,32,60,67,83,87,89,91,...:
                Job has already finished</font></div>
          </blockquote>
          <br class="">
        </div>
        <div>Trying to modify the job’s end time with sacctmgr fails, as
          expected, because EndTime is only a where spec, not a set spec.
          I also tried EndTime=now, with the same results:</div>
        <div>
          <blockquote type="cite" class="">
            <div><font class="" face="Menlo"># sacctmgr modify job where
                JobID=290710 set EndTime=2022-08-09T08:47:01</font></div>
            <div><font class="" face="Menlo"> Unknown option:
                EndTime=2022-08-09T08:47:01</font></div>
            <div><font class="" face="Menlo"> Use keyword 'where' to
                modify condition</font></div>
            <div><font class="" face="Menlo"> You didn't give me
                anything to set</font></div>
          </blockquote>
          <br class="">
        </div>
        <div>I was able to set a comment for the jobs/array, so the DBD
          can see/talk to them.</div>
        <div>One additional thing to mention is that there are 14 JIDs
          that are stuck like this, 1 is an Array JID, and 13 of them
          are array tasks on the original Array ID.</div>
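        <div>One way to enumerate records like these is to look for jobs in
          a finished state whose end time is still in the future. A minimal
          Python sketch, assuming the input is pipe-delimited text such as
          <code>sacct -n -P --format=JobID,State,Start,End</code> would
          produce. The function name and the two 30000x rows are
          hypothetical; only the 290742 times come from this thread.</div>

```python
from datetime import datetime

# States treated as "finished" for this check (not an exhaustive list).
FINISHED = ("CANCELLED", "COMPLETED", "FAILED", "TIMEOUT")


def future_ended_jobs(sacct_text, now):
    """Return job IDs that are finished but whose End is later than now."""
    suspects = []
    for line in sacct_text.strip().splitlines():
        job_id, state, _start, end = line.split("|")
        # startswith also matches states such as "CANCELLED by <uid>".
        if not state.startswith(FINISHED):
            continue
        try:
            end_time = datetime.fromisoformat(end)
        except ValueError:
            continue  # End can be a non-timestamp such as "Unknown"
        if end_time > now:
            suspects.append(job_id)
    return suspects


sample = """\
290742|CANCELLED|2022-08-08T08:46:53|2023-08-08T08:47:01
300001|COMPLETED|2022-12-19T09:00:00|2022-12-19T09:30:00
300002|RUNNING|2022-12-20T08:00:00|Unknown
"""
print(future_ended_jobs(sample, datetime(2022, 12, 20, 12, 0)))
```

        <div>With the sample rows above, only 290742 is reported.</div>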
        <div><br class="">
        </div>
        <div>I figured I would list the other steps I’ve tried so that
          those ideas can be ruled out.</div>
        <div><br class="">
        </div>
        <div>Thanks,</div>
        <div>Reed</div>
        <div><br class="">
          <blockquote type="cite" class="">
            <div class="">On Dec 20, 2022, at 10:08 AM, Reed Dier <<a
                href="mailto:reed.dier@focusvq.com"
                class="moz-txt-link-freetext" moz-do-not-send="true">reed.dier@focusvq.com</a>>
              wrote:</div>
            <br class="Apple-interchange-newline">
            <div class="">
              <meta http-equiv="Content-Type" content="text/html;
                charset=UTF-8" class="">
              <div style="word-wrap: break-word; -webkit-nbsp-mode:
                space; line-break: after-white-space;" class="">2 votes
                for runawayjobs is a strong vote (and also something I’m
                glad to learn exists for the future), however, it does
                not appear to be the case.
                <div class=""><br class="">
                </div>
                <div class="">
                  <blockquote type="cite" class="">
                    <div class=""># sacctmgr show runawayjobs</div>
                    <div class="">Runaway Jobs: No runaway jobs found on
                      cluster $cluster</div>
                  </blockquote>
                  <div class=""><br class="">
                  </div>
                  So unfortunately that doesn’t appear to be the
                  culprit.</div>
                <div class=""><br class="">
                </div>
                <div class="">Appreciate the responses.</div>
                <div class=""><br class="">
                </div>
                <div class="">Reed<br class="">
                  <div class=""><br class="">
                    <blockquote type="cite" class="">
                      <div class="">On Dec 20, 2022, at 10:03 AM, Brian
                        Andrus <<a href="mailto:toomuchit@gmail.com"
                          class="moz-txt-link-freetext"
                          moz-do-not-send="true">toomuchit@gmail.com</a>>
                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <div class="">
                        <meta http-equiv="Content-Type"
                          content="text/html; charset=UTF-8" class="">
                        <div class="">
                          <p class="">Try: <br class="">
                          </p>
                          <p class="">    sacctmgr list runawayjobs</p>
                          <p class="">Brian Andrus<br class="">
                          </p>
                          <div class="moz-cite-prefix">On 12/20/2022
                            7:54 AM, Reed Dier wrote:<br class="">
                          </div>
                          <blockquote type="cite"
                            cite="mid:069A5B5A-CC57-46B8-9CDE-095CA83D7C83@focusvq.com"
                            class="">
                            <meta http-equiv="Content-Type"
                              content="text/html; charset=UTF-8"
                              class="">
                            Hoping this is a fairly simple one.
                            <div class=""><br class="">
                            </div>
                            <div class="">This is a small internal
                              cluster that we’ve been using for about 6
                              months now, and we’ve had some
                              infrastructure instability in that time,
                              which I think may be the root culprit
                              behind this weirdness, but hopefully
                              someone can point me in the direction to
                              solve the issue.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">I do a daily email of sreport
                              to show how busy the cluster was, and who
                              were the top users.</div>
                            <div class="">Weirdly, I have a user that
                              seems to be able to use the same exact
                              usage day after day after day, down to
                              hundredth of a percent, conspicuously even
                              when they were on vacation and claimed
                              that they didn’t have job submissions in
                              cron/etc.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">So then, taking a spin of the <a
href="https://lists.schedmd.com/pipermail/slurm-users/2022-December/009514.html"
                                class="" moz-do-not-send="true">scom
                                tui </a>posted this morning, I then
                              filtered that user, and noticed that even
                              though I was only looking 2 days back at
                              job history, I was seeing a job from
                              August.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">Conspicuously, the job state
                              is cancelled, but the job end time is 1y
                              from the start time, meaning its job end
                              time is in 2023.</div>
                            <div class="">So something with the dbd is
                              confused about this/these jobs that are
                              lingering and reporting cancelled but
                              still â€œon the books” somehow until next
                              August.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">
                              <blockquote type="cite" class="">
                                <div class=""><font class=""
                                    face="Menlo">╭──────────────────────────────────────────────────────────────────────────────────────────╮</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job ID Â  Â  Â  Â  Â  Â  Â 
                                    : 290742 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Name Â  Â  Â  Â  Â  Â 
                                    : $jobname Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â User Â  Â  Â  Â  Â  Â  Â  Â 
                                    : $user Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Group Â  Â  Â  Â  Â  Â  Â 
                                    Â : $user Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Account Â  Â  Â  Â 
                                    Â : $account Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Submission Â  Â  Â 
                                    : 2022-08-08 08:44:52 -0400 EDT Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Start Â  Â  Â  Â  Â 
                                    Â : 2022-08-08 08:46:53 -0400 EDT Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job End Â  Â  Â  Â  Â  Â 
                                    Â : 2023-08-08 08:47:01 -0400 EDT Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Wait time Â  Â  Â 
                                    Â : 2m1s Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Job Run time Â  Â  Â  Â 
                                    : 8760h0m8s Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Partition Â  Â  Â  Â  Â 
                                    Â : $part Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â Priority Â  Â  Â  Â  Â  Â 
                                    : 127282 Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â QoS Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â : $qos Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">│ Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â â”‚</font></div>
                                <div class=""><font class=""
                                    face="Menlo">╰──────────────────────────────────────────────────────────────────────────────────────────╯</font></div>
                                <div class=""><font class=""
                                    face="Menlo">Steps count: 0</font></div>
                              </blockquote>
                              <br class="">
                            </div>
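                            <div class="">As a quick sanity check of the
                              figures above, the reported run time matches
                              the gap between the recorded start and end
                              times: exactly 365 days plus 8 seconds. A
                              small Python sketch using the timestamps
                              from the job detail (both are EDT, so the
                              offset is left out):</div>

```python
from datetime import datetime, timedelta

# Start and end times copied from the job detail above.
start = datetime.fromisoformat("2022-08-08 08:46:53")
end = datetime.fromisoformat("2023-08-08 08:47:01")

elapsed = end - start
hours, rem = divmod(int(elapsed.total_seconds()), 3600)
minutes, seconds = divmod(rem, 60)

print(f"{hours}h{minutes}m{seconds}s")       # 8760h0m8s
print(elapsed - timedelta(days=365))         # 0:00:08
```

                            <div class="">So the job apparently ran for
                              about 8 seconds, and the recorded end time
                              is one year too late.</div>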
                            <div class="">
                              <blockquote type="cite" class=""><font
                                  class="" face="Menlo">Filter: $user Â 
                                  Â  Â  Â  Items: 13</font></blockquote>
                              <blockquote type="cite" class="">
                                <div class=""><font class=""
                                    face="Menlo"><br class="">
                                  </font></div>
                                <div class=""><font class=""
                                    face="Menlo"> Job ID Â  Â  Â Job Name Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Part. Â QoS
                                    Â  Â  Â  Â  Account Â  Â  User Â  Â  Â  Â  Â  Â 
                                    Nodes Â  Â  Â  Â  Â  Â  Â  Â  State</font></div>
                                <div class=""><font class=""
                                    face="Menlo">───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290714 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node32 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290716 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node24 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290736 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node00 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290742 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node01 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290770 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node02 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290777 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node03 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290793 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node04 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290797 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node05 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290799 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node06 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290801 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node07 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290814 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node08 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290817 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node09 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                                <div class=""><font class=""
                                    face="Menlo"> 290819 Â  Â  Â $jobname Â 
                                    Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  $part
                                    Â $qos Â  Â  Â  Â $acct Â  Â  Â  $user Â  Â  Â 
                                    Â  Â  Â node10 Â  Â  Â  Â  Â  Â  Â  Â CANCELLED</font></div>
                              </blockquote>
                            </div>
                            <div class=""><br class="">
                            </div>
                            <div class="">I’d love to figure out the
                              proper way to either purge these jid’s
                              from the accounting database cleanly, or
                              change the job end/run time to a
                              sane/correct value.</div>
                            <div class="">Slurm is v21.08.8-2, and ntp
                              is a stratum 1 server, so time is in sync
                              everywhere, not that multiple servers
                              would drift 1 year off like this.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">Thanks for any help,</div>
                            <div class="">Reed</div>
                          </blockquote>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br class="">
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br class="">
      </div>
    </blockquote>
  </body>
</html>