<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 11.08.2020 at 20:55, Richard
      Lefebvre wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAHuHHxrGsiV66nUcLHO+KOEdzshj-mPfryUzPi5UKpkgHsOYfQ@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div class="gmail_default" style="font-size:small">
          <div class="gmail_default" style="font-size:small">Hi,</div>
          <div class="gmail_default" style="font-size:small"><br>
          </div>
          <div class="gmail_default" style="font-size:small">The command
            "sshare -l" is crashing. I traced the crash to a single
            account whose LevelFS is extremely large, on the order of
            4.8x10^16. I can see the value if I add the "-p" option. Is
            there a way to fix the account? <br>
          </div>
          <div class="gmail_default" style="font-size:small"><br>
          </div>
        </div>
      </div>
    </blockquote>
    <p>I have seen this as well. I did not bother to trace it in the
      code, but I would guess it's some underflow problem: as the raw
      usage of the account decays toward zero, the LevelFS grows ever
      larger.</p>
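    <p>A minimal sketch of the suspected effect, not taken from the Slurm
      source. It assumes that LevelFS behaves like the ratio of normalized
      shares to normalized usage, which is how I read the Fair Tree
      documentation. The function name <code>level_fs</code>, the starting
      usage and the decay factor are hypothetical; only the NormShares
      value comes from the output quoted in this thread.</p>

```python
import math


def level_fs(norm_shares: float, norm_usage: float) -> float:
    """Shares-to-usage ratio; exactly zero usage is reported as infinity."""
    if norm_usage == 0.0:
        return math.inf
    return norm_shares / norm_usage


if __name__ == "__main__":
    norm_shares = 0.003724  # NormShares from the quoted sshare output
    usage = 1.0e-3          # hypothetical starting usage fraction
    for _ in range(6):
        ratio = level_fs(norm_shares, usage)
        print(f"usage={usage:.3e}  LevelFS={ratio:.4e}")
        usage *= 1.0e-4     # hypothetical, exaggerated decay per step
    # A 'true' zero takes the other branch and yields inf instead.
    print(level_fs(norm_shares, 0.0))
```

    <p>The ratio stays finite for any tiny but nonzero usage, so it can
      grow far beyond what a fixed-width column expects, while an exact
      zero is handled separately.</p>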
    <p>It can be fixed by resetting the account to 'true' zero usage
      (the RawUsage of 0 that sshare prints is presumably a rounded,
      tiny nonzero value):</p>
    <p>sacctmgr modify account NAME set rawusage=0</p>
    <p>When the next fair-share recalculation kicks in, the huge LevelFS
      resets to 'inf' and the problem goes away.</p>
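    <p>To find affected accounts without triggering the crash, the
      pipe-delimited "-p" output can be scanned for finite but absurdly
      large LevelFS values. The sketch below is only an illustration: the
      helper name <code>suspicious_accounts</code> and the threshold of
      1e12 are assumptions, and the sample text is the output quoted in
      this thread.</p>

```python
import math

# Output of "sshare -l -p --account=group001_cpu" as quoted in this thread.
SAMPLE = (
    "Account|User|RawShares|NormShares|RawUsage|NormUsage|EffectvUsage|"
    "FairShare|LevelFS|GrpTRESMins|TRESRunMins|\n"
    "group001_cpu||650216|0.003724|0|0.000000|0.000000||"
    "48285673640776424.000000||cpu=0,mem=0,energy=0,node=0,billing=0,"
    "fs/disk=0,vmem=0,pages=0,gres/gpu=0|\n"
)


def suspicious_accounts(text, threshold=1.0e12):
    """Return (account, LevelFS) pairs whose LevelFS is finite but above threshold."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].strip().split("|")
    acct_col = header.index("Account")
    fs_col = header.index("LevelFS")
    hits = []
    for line in lines[1:]:
        fields = line.strip().split("|")
        raw = fields[fs_col] if len(fields) > fs_col else ""
        try:
            value = float(raw)
        except ValueError:
            continue  # empty or non-numeric LevelFS column
        # 'inf' is the normal value for zero usage, so only finite values count.
        if math.isfinite(value) and value > threshold:
            hits.append((fields[acct_col].strip(), value))
    return hits


if __name__ == "__main__":
    for account, value in suspicious_accounts(SAMPLE):
        print(f"{account}: LevelFS={value:.4e}")
```

    <p>On a live system the text would come from running sshare with "-l
      -p" instead of the embedded sample; each reported account is a
      candidate for the rawusage reset.</p>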
    <p>Regards,</p>
    <p>Holger N.<br>
    </p>
    <blockquote type="cite"
cite="mid:CAHuHHxrGsiV66nUcLHO+KOEdzshj-mPfryUzPi5UKpkgHsOYfQ@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_default" style="font-size:small">
          <div class="gmail_default" style="font-size:small">Below is the
            output of the two commands: first with "-p", then the one
            that crashes:</div>
          <div class="gmail_default" style="font-size:small"><br>
          </div>
          <div class="gmail_default" style="font-size:small">sshare -l
            -p --account=group001_cpu<br>
Account|User|RawShares|NormShares|RawUsage|NormUsage|EffectvUsage|FairShare|LevelFS|GrpTRESMins|TRESRunMins|<br>
group001_cpu||650216|0.003724|0|0.000000|0.000000||48285673640776424.000000||cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0|<br>
          </div>
          <div class="gmail_default" style="font-size:small"><br>
          </div>
          <div class="gmail_default" style="font-size:small">sshare -l
            --account=group001_cpu<br>
                         Account       User  RawShares  NormShares  
             RawUsage   NormUsage  EffectvUsage  FairShare    LevelFS  
                             GrpTRESMins                    TRESRunMins
            <br>
            -------------------- ---------- ---------- -----------
            ----------- ----------- ------------- ---------- ----------
            ------------------------------
            ------------------------------ <br>
            *** Error in `sshare': free(): invalid next size (fast):
            0x0000000000eff280 ***<br>
            ======= Backtrace: =========<br>
            /lib64/libc.so.6(+0x81679)[0x7efd0e82a679]<br>
/opt/software/slurm/lib64/slurm/libslurmfull.so(slurm_xfree+0x1d)[0x7efd0fcb9009]<br>
/opt/software/slurm/lib64/slurm/libslurmfull.so(print_fields_double+0x2d6)[0x7efd0fc02a08]<br>
            sshare(process+0x51c)[0x4024c9]<br>
            sshare[0x40292c]<br>
            sshare(main+0xa2d)[0x40337f]<br>
            /lib64/libc.so.6(__libc_start_main+0xf5)[0x7efd0e7cb505]<br>
            sshare[0x401da9]<br>
            ======= Memory map: ========<br>
            00400000-00405000 r-xp 00000000 00:2f 51577                
                         /opt/software/slurm/bin/sshare<br>
            00604000-00605000 r--p 00004000 00:2f 51577                
                         /opt/software/slurm/bin/sshare<br>
            00605000-00606000 rw-p 00005000 00:2f 51577                
                         /opt/software/slurm/bin/sshare<br>
            00ee3000-00f23000 rw-p 00000000 00:00 0                    
                         [heap]<br>
            7efd08000000-7efd08021000 rw-p 00000000 00:00 0 <br>
            7efd08021000-7efd0c000000 ---p 00000000 00:00 0 <br>
            7efd0d564000-7efd0d579000 r-xp 00000000 00:24 61849        
                         /usr/lib64/libgcc_s-4.8.5-20150702.so.1<br>
            7efd0d579000-7efd0d778000 ---p 00015000 00:24 61849        
                         /usr/lib64/libgcc_s-4.8.5-20150702.so.1<br>
            7efd0d778000-7efd0d779000 r--p 00014000 00:24 61849        
                         /usr/lib64/libgcc_s-4.8.5-20150702.so.1<br>
            7efd0d779000-7efd0d77a000 rw-p 00015000 00:24 61849        
                         /usr/lib64/libgcc_s-4.8.5-20150702.so.1<br>
            7efd0d77a000-7efd0d783000 r-xp 00000000 00:24 66799        
                         /usr/lib64/libmunge.so.2.0.0<br>
            7efd0d783000-7efd0d982000 ---p 00009000 00:24 66799        
                         /usr/lib64/libmunge.so.2.0.0<br>
            7efd0d982000-7efd0d983000 r--p 00008000 00:24 66799        
                         /usr/lib64/libmunge.so.2.0.0<br>
            7efd0d983000-7efd0d984000 rw-p 00009000 00:24 66799        
                         /usr/lib64/libmunge.so.2.0.0<br>
            7efd0d984000-7efd0d987000 r-xp 00000000 00:2f 51448        
                         /opt/software/slurm/lib64/slurm/auth_munge.so<br>
            7efd0d987000-7efd0db86000 ---p 00003000 00:2f 51448        
                         /opt/software/slurm/lib64/slurm/auth_munge.so<br>
            7efd0db86000-7efd0db87000 r--p 00002000 00:2f 51448        
                         /opt/software/slurm/lib64/slurm/auth_munge.so<br>
            7efd0db87000-7efd0db88000 rw-p 00003000 00:2f 51448        
                         /opt/software/slurm/lib64/slurm/auth_munge.so<br>
            7efd0db88000-7efd0e38d000 r--s 00000000 00:24 191641        
                        /var/lib/sss/mc/passwd<br>
            7efd0e38d000-7efd0e395000 r-xp 00000000 00:24 66184        
                         /usr/lib64/libnss_sss.so.2<br>
            7efd0e395000-7efd0e594000 ---p 00008000 00:24 66184        
                         /usr/lib64/libnss_sss.so.2<br>
            7efd0e594000-7efd0e595000 r--p 00007000 00:24 66184        
                         /usr/lib64/libnss_sss.so.2<br>
            7efd0e595000-7efd0e596000 rw-p 00008000 00:24 66184        
                         /usr/lib64/libnss_sss.so.2<br>
            7efd0e596000-7efd0e5a2000 r-xp 00000000 00:24 62229        
                         /usr/lib64/libnss_files-2.17.so<br>
            7efd0e5a2000-7efd0e7a1000 ---p 0000c000 00:24 62229        
                         /usr/lib64/libnss_files-2.17.so<br>
            7efd0e7a1000-7efd0e7a2000 r--p 0000b000 00:24 62229        
                         /usr/lib64/libnss_files-2.17.so<br>
            7efd0e7a2000-7efd0e7a3000 rw-p 0000c000 00:24 62229        
                         /usr/lib64/libnss_files-2.17.so<br>
            7efd0e7a3000-7efd0e7a9000 rw-p 00000000 00:00 0 <br>
            7efd0e7a9000-7efd0e96c000 r-xp 00000000 00:24 62154        
                         /usr/lib64/libc-2.17.so<br>
            7efd0e96c000-7efd0eb6c000 ---p 001c3000 00:24 62154        
                         /usr/lib64/libc-2.17.so<br>
            7efd0eb6c000-7efd0eb70000 r--p 001c3000 00:24 62154        
                         /usr/lib64/libc-2.17.so<br>
            7efd0eb70000-7efd0eb72000 rw-p 001c7000 00:24 62154        
                         /usr/lib64/libc-2.17.so<br>
            7efd0eb72000-7efd0eb77000 rw-p 00000000 00:00 0 <br>
            7efd0eb77000-7efd0eb8e000 r-xp 00000000 00:24 62349        
                         /usr/lib64/libpthread-2.17.so<br>
            7efd0eb8e000-7efd0ed8d000 ---p 00017000 00:24 62349        
                         /usr/lib64/libpthread-2.17.so<br>
            7efd0ed8d000-7efd0ed8e000 r--p 00016000 00:24 62349        
                         /usr/lib64/libpthread-2.17.so<br>
            7efd0ed8e000-7efd0ed8f000 rw-p 00017000 00:24 62349        
                         /usr/lib64/libpthread-2.17.so<br>
            7efd0ed8f000-7efd0ed93000 rw-p 00000000 00:00 0 <br>
            7efd0ed93000-7efd0edb8000 r-xp 00000000 00:24 62205        
                         /usr/lib64/libtinfo.so.5.9<br>
            7efd0edb8000-7efd0efb8000 ---p 00025000 00:24 62205        
                         /usr/lib64/libtinfo.so.5.9<br>
            7efd0efb8000-7efd0efbc000 r--p 00025000 00:24 62205        
                         /usr/lib64/libtinfo.so.5.9<br>
            7efd0efbc000-7efd0efbd000 rw-p 00029000 00:24 62205        
                         /usr/lib64/libtinfo.so.5.9<br>
            7efd0efbd000-7efd0efe3000 r-xp 00000000 00:24 62147        
                         /usr/lib64/libncurses.so.5.9<br>
            7efd0efe3000-7efd0f1e2000 ---p 00026000 00:24 62147        
                         /usr/lib64/libncurses.so.5.9<br>
            7efd0f1e2000-7efd0f1e3000 r--p 00025000 00:24 62147        
                         /usr/lib64/libncurses.so.5.9<br>
            7efd0f1e3000-7efd0f1e4000 rw-p 00026000 00:24 62147        
                         /usr/lib64/libncurses.so.5.9<br>
            7efd0f1e4000-7efd0f1ec000 r-xp 00000000 00:24 62410        
                         /usr/lib64/libhistory.so.6.2<br>
            7efd0f1ec000-7efd0f3eb000 ---p 00008000 00:24 62410        
                         /usr/lib64/libhistory.so.6.2<br>
            7efd0f3eb000-7efd0f3ec000 r--p 00007000 00:24 62410        
                         /usr/lib64/libhistory.so.6.2<br>
            7efd0f3ec000-7efd0f3ed000 rw-p 00008000 00:24 62410        
                         /usr/lib64/libhistory.so.6.2<br>
            7efd0f3ed000-7efd0f429000 r-xp 00000000 00:24 62408        
                         /usr/lib64/libreadline.so.6.2<br>
            7efd0f429000-7efd0f629000 ---p 0003c000 00:24 62408        
                         /usr/lib64/libreadline.so.6.2<br>
            7efd0f629000-7efd0f62b000 r--p 0003c000 00:24 62408        
                         /usr/lib64/libreadline.so.6.2<br>
            7efd0f62b000-7efd0f631000 rw-p 0003e000 00:24 62408        
                         /usr/lib64/libreadline.so.6.2<br>
            7efd0f631000-7efd0f633000 rw-p 00000000 00:00 0 <br>
            7efd0f633000-7efd0f734000 r-xp 00000000 00:24 62170        
                         /usr/lib64/libm-2.17.so<br>
            7efd0f734000-7efd0f933000 ---p 00101000 00:24 62170        
                         /usr/lib64/libm-2.17.so<br>
            7efd0f933000-7efd0f934000 r--p 00100000 00:24 62170        
                         /usr/lib64/libm-2.17.so<br>
            7efd0f934000-7efd0f935000 rw-p 00101000 00:24 62170        
                         /usr/lib64/libm-2.17.so<br>
            7efd0f935000-7efd0f937000 r-xp 00000000 00:24 62166        
                         /usr/lib64/libdl-2.17.so<br>
            group001_cpu                    650216    0.003712           0
            0.000000     0.000000            4.8104e+16 Aborted</div>
        </div>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Dr. Holger Naundorf
Christian-Albrechts-Universität zu Kiel
Rechenzentrum / HPC / Server und Storage
Tel: +49 431 880-1990
Fax:  +49 431 880-1523
<a class="moz-txt-link-abbreviated" href="mailto:naundorf@rz.uni-kiel.de">naundorf@rz.uni-kiel.de</a></pre>
  </body>
</html>