<div dir="ltr"><div><div>We are using cgroups to track the resource usage of our jobs. The jobs run in Docker, with Docker's --cgroup-parent flag pointing at the Slurm job's cgroup. This works great for limiting memory usage.<br><br></div>Unfortunately, the maximum memory usage (MaxRSS) reported by sacct is not accurate, whereas the cgroup's memory.max_usage_in_bytes does show accurate numbers.<br><br></div>Looking at the cgroup:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="text-align:left">/sys/fs/cgroup/memory/slurm/uid_500/job_31626/memory.max_usage_in_bytes:1132154880 # about 1.05 GiB<br>/sys/fs/cgroup/memory/slurm/uid_500/job_31626/memory.use_hierarchy:1<br>/sys/fs/cgroup/memory/slurm/uid_500/job_31626/memory.stat:rss 0<br>/sys/fs/cgroup/memory/slurm/uid_500/job_31626/memory.stat:total_rss 524288</div></blockquote><p class="MsoNormal"><span lang="EN-US"><span lang="EN-US"><br></span></span></p><p class="MsoNormal"><span lang="EN-US"><span lang="EN-US">Looking at sacct:<br>
</span></span></p><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><p class="MsoNormal"><span lang="EN-US">$ sacct -j 31626 -o jobid,AveRSS,MaxRSS,AveVMSize,MaxVMSize,ReqMem,TotalCPU</span></p><p class="MsoNormal"><span lang="EN-US"> JobID AveRSS MaxRSS MaxVMSize</span></p>31626.batch 28600K 28600K 77900K</blockquote><div><br></div><div>Since we are using the cgroup plugins, I expected sacct to report at least some of the cgroup's statistics, but MaxRSS (28600K) is far below the roughly 1 GB peak recorded by the cgroup.<br><br></div><div>Relevant lines from slurm.conf:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><p class="MsoNormal"><span>JobAcctGatherFrequency=30</span></p><p class="MsoNormal"><span lang="EN-US">JobAcctGatherType=jobacct_gather/cgroup</span></p><p class="MsoNormal"><span lang="EN-US">ProctrackType=proctrack/cgroup</span></p><p class="MsoNormal"><span lang="EN-US">TaskPlugin=task/affinity,task/cgroup</span></p><p class="MsoNormal"><span lang="EN-US">SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK</span></p></blockquote><div><br></div><div>cgroup.conf:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><p class="MsoNormal"><span lang="EN-US">CgroupAutomount=yes</span></p><p class="MsoNormal"><span lang="EN-US">CgroupMountpoint=/sys/fs/cgroup</span></p><p class="MsoNormal"><span lang="EN-US"> </span></p><p class="MsoNormal"><span lang="EN-US">### Task/cgroup Plugin ###</span></p><p class="MsoNormal"><span lang="EN-US"># Constrain allowed cores to the subset of allocated resources.</span></p><p class="MsoNormal"><span lang="EN-US"># This functionality makes use of the cpuset subsystem</span></p><p class="MsoNormal"><span lang="EN-US">ConstrainCores=yes</span></p><p class="MsoNormal"><span lang="EN-US">ConstrainKmemSpace=yes</span></p><p class="MsoNormal"><span lang="EN-US">ConstrainRAMSpace=yes</span></p><p class="MsoNormal"><span lang="EN-US">ConstrainSwapSpace=yes</span></p><p class="MsoNormal"><span lang="EN-US">ConstrainDevices=no</span></p><p class="MsoNormal"><span lang="EN-US">MinKmemSpace=30</span></p><p class="MsoNormal"><span lang="EN-US">MinRAMSpace=30</span></p><p class="MsoNormal"><span lang="EN-US"># Set a default task affinity to bind each step task to a subset of the</span></p><p class="MsoNormal"><span lang="EN-US"># allocated cores using sched_setaffinity</span></p><p class="MsoNormal"><span lang="EN-US"># /!\ This feature requires the Portable Hardware Locality (hwloc) library</span></p><p class="MsoNormal"><span>TaskAffinity=no</span></p></blockquote>
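<div><br></div><div>To make the discrepancy easier to check on other jobs, here is a minimal Python sketch that reads the two cgroup v1 files shown above and converts a sacct size string to bytes for comparison. The function names are hypothetical, and the sketch assumes that sacct's K/M/G suffixes are powers of 1024. Keep in mind that memory.max_usage_in_bytes in cgroup v1 also counts page cache, so it is not strictly the same quantity as MaxRSS.</div>

```python
from pathlib import Path

# Assumption: sacct size suffixes are powers of 1024 (K = 1024 bytes).
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_sacct_size(value):
    """Convert a sacct size string such as '28600K' to bytes."""
    value = value.strip()
    if not value:
        return 0
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(float(value[:-1]) * _UNITS[suffix])
    return int(float(value))  # no suffix: treat the value as plain bytes


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v1 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def read_job_cgroup(job_dir):
    """Read peak usage and RSS counters from a job's memory cgroup directory."""
    job_dir = Path(job_dir)
    peak = int((job_dir / "memory.max_usage_in_bytes").read_text())
    stats = parse_memory_stat((job_dir / "memory.stat").read_text())
    return {
        "peak_bytes": peak,
        "rss": stats.get("rss", 0),
        "total_rss": stats.get("total_rss", 0),
    }
```

<div>For the job above, one would call read_job_cgroup("/sys/fs/cgroup/memory/slurm/uid_500/job_31626") on the compute node while the job's cgroup still exists, and compare peak_bytes against parse_sacct_size("28600K").</div>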
</div>
</div>
</div>