<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Try: <br>
    </p>
    <p>    sacctmgr list runawayjobs</p>
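    <p>As a quick sanity check on the timestamps in the quoted job
      details below, here is a minimal sketch of the arithmetic. The
      variable names are mine, and it assumes both timestamps share the
      same UTC offset, as the quoted output shows (-0400 EDT), so naive
      datetimes are enough. It only confirms that the reported run time
      of 8760h0m8s is the gap between the listed start and end, which
      comes to 365 days plus 8 seconds.</p>
<pre>
```python
from datetime import datetime, timedelta

# Timestamps copied from the quoted job details for job 290742.
# Both are shown with the same offset (-0400 EDT), so naive datetimes
# are sufficient for a subtraction.
job_start = datetime(2022, 8, 8, 8, 46, 53)
job_end = datetime(2023, 8, 8, 8, 47, 1)

# The "Job Run time" field reported by the TUI: 8760h0m8s.
reported_run_time = timedelta(hours=8760, seconds=8)

# Gap between the recorded start and end times.
elapsed = job_end - job_start

# Compare the computed gap with the reported run time, then show the
# gap broken into whole days and leftover seconds.
print(elapsed == reported_run_time)
print(elapsed.days, elapsed.seconds)
```
</pre>
    <p>That matches the observation that the end time sits one year
      after the start, rather than at any time the job could actually
      have been cancelled.</p>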
    <p>Brian Andrus<br>
    </p>
    <div class="moz-cite-prefix">On 12/20/2022 7:54 AM, Reed Dier wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:069A5B5A-CC57-46B8-9CDE-095CA83D7C83@focusvq.com">
      Hoping this is a fairly simple one.
      <div class=""><br class="">
      </div>
      <div class="">This is a small internal cluster that we’ve been
        using for about 6 months now, and we’ve had some infrastructure
        instability in that time, which I think may be the root culprit
        behind this weirdness, but hopefully someone can point me in the
        direction to solve the issue.</div>
      <div class=""><br class="">
      </div>
      <div class="">I do a daily email of sreport to show how busy the
        cluster was, and who were the top users.</div>
      <div class="">Weirdly, I have a user that seems to be able to use
        the same exact usage day after day after day, down to hundredth
        of a percent, conspicuously even when they were on vacation and
        claimed that they didn’t have job submissions in cron/etc.</div>
      <div class=""><br class="">
      </div>
      <div class="">So then, taking a spin of the <a
href="https://lists.schedmd.com/pipermail/slurm-users/2022-December/009514.html"
          class="" moz-do-not-send="true">scom tui </a>posted this
        morning, I then filtered that user, and noticed that even though
        I was only looking 2 days back at job history, I was seeing a
        job from August.</div>
      <div class=""><br class="">
      </div>
      <div class="">Conspicuously, the job state is cancelled, but the
        job end time is 1y from the start time, meaning its job end time
        is in 2023.</div>
      <div class="">So something with the dbd is confused about
        this/these jobs that are lingering and reporting cancelled but
        still “on the books” somehow until next August.</div>
      <div class=""><br class="">
      </div>
      <div class="">
        <blockquote type="cite" class="">
          <div class=""><font class="" face="Menlo">╭──────────────────────────────────────────────────────────────────────────────────────────╮</font></div>
          <div class=""><font class="" face="Menlo">│                  
                                                                       
                           │</font></div>
          <div class=""><font class="" face="Menlo">│  Job ID          
                  : 290742                                              
                          │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Name        
                  : $jobname                                            
                          │</font></div>
          <div class=""><font class="" face="Menlo">│  User            
                  : $user                                              
                           │</font></div>
          <div class=""><font class="" face="Menlo">│  Group            
                 : $user                                                
                         │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Account      
                 : $account                                            
                          │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Submission  
                  : 2022-08-08 08:44:52 -0400 EDT                      
                           │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Start        
                 : 2022-08-08 08:46:53 -0400 EDT                        
                         │</font></div>
          <div class=""><font class="" face="Menlo">│  Job End          
                 : 2023-08-08 08:47:01 -0400 EDT                        
                         │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Wait time    
                 : 2m1s                                                
                          │</font></div>
          <div class=""><font class="" face="Menlo">│  Job Run time    
                  : 8760h0m8s                                          
                           │</font></div>
          <div class=""><font class="" face="Menlo">│  Partition        
                 : $part                                                
                         │</font></div>
          <div class=""><font class="" face="Menlo">│  Priority        
                  : 127282                                              
                          │</font></div>
          <div class=""><font class="" face="Menlo">│  QoS              
                 : $qos                                                
                          │</font></div>
          <div class=""><font class="" face="Menlo">│                  
                                                                       
                           │</font></div>
          <div class=""><font class="" face="Menlo">│                  
                                                                       
                           │</font></div>
          <div class=""><font class="" face="Menlo">╰──────────────────────────────────────────────────────────────────────────────────────────╯</font></div>
          <div class=""><font class="" face="Menlo">Steps count: 0</font></div>
        </blockquote>
        <br class="">
      </div>
      <div class="">
        <blockquote type="cite" class=""><font class="" face="Menlo">Filter:
            $user         Items: 13</font></blockquote>
        <blockquote type="cite">
          <div class=""><font class="" face="Menlo"><br class="">
            </font></div>
          <div class=""><font class="" face="Menlo"> Job ID      Job
              Name                             Part.  QoS        
              Account     User             Nodes                 State</font></div>
          <div class=""><font class="" face="Menlo">───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────</font></div>
          <div class=""><font class="" face="Menlo"> 290714    
               $jobname                             $part  $qos      
               $acct       $user            node32              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290716    
               $jobname                             $part  $qos      
               $acct       $user            node24              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290736    
               $jobname                             $part  $qos      
               $acct       $user            node00              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290742    
               $jobname                             $part  $qos      
               $acct       $user            node01              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290770    
               $jobname                             $part  $qos      
               $acct       $user            node02              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290777    
               $jobname                             $part  $qos      
               $acct       $user            node03              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290793    
               $jobname                             $part  $qos      
               $acct       $user            node04              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290797    
               $jobname                             $part  $qos      
               $acct       $user            node05              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290799    
               $jobname                             $part  $qos      
               $acct       $user            node06              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290801    
               $jobname                             $part  $qos      
               $acct       $user            node07              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290814    
               $jobname                             $part  $qos      
               $acct       $user            node08              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290817    
               $jobname                             $part  $qos      
               $acct       $user            node09              
               CANCELLED</font></div>
          <div class=""><font class="" face="Menlo"> 290819    
               $jobname                             $part  $qos      
               $acct       $user            node10              
               CANCELLED</font></div>
        </blockquote>
      </div>
      <div class=""><br class="">
      </div>
      <div class="">I’d love to figure out the proper way to either
        purge these jid’s from the accounting database cleanly, or
        change the job end/run time to a sane/correct value.</div>
      <div class="">Slurm is v21.08.8-2, and ntp is a stratum 1 server,
        so time is in sync everywhere, not that multiple servers would
        drift 1 year off like this.</div>
      <div class=""><br class="">
      </div>
      <div class="">Thanks for any help,</div>
      <div class="">Reed</div>
    </blockquote>
  </body>
</html>