<div dir="ltr"><div class="gmail_default" style="font-family:courier new,monospace">Thanks to everyone who responded!</div><div class="gmail_default" style="font-family:courier new,monospace"><br></div><div class="gmail_default" style="font-family:courier new,monospace">Setting <span style="margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;color:rgb(70,84,92)">SelectTypeParameters = </span><span style="margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;color:rgb(70,84,92)">CR_CPU_Memory in slurm.conf did the trick.</span></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 23, 2023 at 3:21 AM Shunran Zhang <<a href="mailto:szhang@ngs.gen-info.osaka-u.ac.jp">szhang@ngs.gen-info.osaka-u.ac.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Hi,<div dir="auto"><br></div><div dir="auto">Would you mind checking your job scheduling settings in slurm.conf?</div><div dir="auto"><br></div><div dir="auto">Namely <b style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;color:rgb(70,84,92);background-color:rgb(255,255,255)">SelectTypeParameters = </b><b style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;color:rgb(70,84,92);background-color:rgb(255,255,255)">CR_CPU_Memory </b><span style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)">or the like.</span></div><div dir="auto"><span 
style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)"><br></span></div><div dir="auto"><span style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)">Also, you may want to use systemd-cgtop to at least confirm jobs are indeed running in cgroups.</span></div><div dir="auto"><span style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)"><br></span></div><div dir="auto"><span style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)">Sincerely,</span></div><div dir="auto"><span style="font-size:18px;margin:0px;padding:0px;border:0px;line-height:inherit;font-family:"source sans pro",helvetica,arial,sans-serif;vertical-align:baseline;background-color:rgb(255,255,255)">S. 
Zhang</span></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 23, 2023, 12:07 Boris Yazlovitsky <<a href="mailto:borisyaz@gmail.com" target="_blank">borisyaz@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_default" style="font-family:"courier new",monospace">it's still not constraining memory...</div><div class="gmail_default" style="font-family:"courier new",monospace"><br></div><div class="gmail_default" style="font-family:"courier new",monospace">a memhog job continues to memhog:</div><div class="gmail_default" style="font-family:"courier new",monospace"><br></div><div class="gmail_default" style="font-family:"courier new",monospace">boris@rod:~/scripts$ sacct --starttime=2023-05-01 --format=jobid,user,start,elapsed,reqmem,maxrss,maxvmsize,nodelist,state,exit -j 199<br>JobID User Start Elapsed ReqMem MaxRSS MaxVMSize NodeList State ExitCode <br>------------ --------- ------------------- ---------- ---------- ---------- ---------- --------------- ---------- -------- <br>199 boris 2023-06-23T02:42:30 00:01:21 1M milhouse COMPLETED 0:0 <br>199.batch 2023-06-23T02:42:30 00:01:21 104857988K 104858064K milhouse COMPLETED 0:0 <br></div><div class="gmail_default" style="font-family:"courier new",monospace"><br></div><div class="gmail_default" style="font-family:"courier new",monospace">One thing I noticed is that the machines I'm working on do not have libcgroup and libcgroup-dev installed - but slurm does have its own cgroup implementation? the slurmd processes do utilize /usr/lib/slurm/*cgroup.so objects. 
I will try recompiling Slurm with those libcgroup packages present.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jun 22, 2023 at 6:04 PM Ozeryan, Vladimir <<a href="mailto:Vladimir.Ozeryan@jhuapl.edu" rel="noreferrer" target="_blank">Vladimir.Ozeryan@jhuapl.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
<div lang="EN-US">
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">No worries,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">No, we don’t have any OS-level settings, only “allowed_devices.conf”, which just lists /dev/random, /dev/tty and the like.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">But I think this could be the culprit; check the man page for cgroup.conf:<br>
</span><span style="font-family:"Courier New"">AllowedRAMSpace=100<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New""><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">I would just leave these four:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">CgroupAutomount=yes<br>
ConstrainCores=yes<br>
ConstrainDevices=yes<br>
ConstrainRAMSpace=yes<u></u><u></u></span></p>
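<p class="MsoNormal">For anyone who wants to check their own file against these settings, here is a minimal Python sketch. It assumes plain Key=Value lines with '#' comments and case-insensitive option names; the helper names parse_conf and missing_constraints are made up for illustration and are not part of Slurm.</p>
<pre>
```python
def parse_conf(text):
    """Parse simple Key=Value lines, ignoring blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Assumption: option names are matched case-insensitively.
        settings[key.strip().lower()] = value.strip()
    return settings


def missing_constraints(settings, wanted=("ConstrainCores",
                                          "ConstrainDevices",
                                          "ConstrainRAMSpace")):
    """Return the wanted options that are not set to 'yes'."""
    return [name for name in wanted
            if settings.get(name.lower(), "no").lower() != "yes"]


# Sample text copied from the four settings listed in this message.
sample = """\
CgroupAutomount=yes
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
"""
print(missing_constraints(parse_conf(sample)))
```
</pre>
<p class="MsoNormal">An empty list means all three Constrain* options are set to yes. To check a real node, the file contents could be read from wherever cgroup.conf lives on that system and passed to parse_conf.</p>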
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Vlad.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:11pt;font-family:Calibri,sans-serif">From:</span></b><span style="font-size:11pt;font-family:Calibri,sans-serif"> slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users-bounces@lists.schedmd.com</a>>
<b>On Behalf Of </b>Boris Yazlovitsky<br>
<b>Sent:</b> Thursday, June 22, 2023 5:40 PM<br>
<b>To:</b> Slurm User Community List <<a href="mailto:slurm-users@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users@lists.schedmd.com</a>><br>
<b>Subject:</b> Re: [slurm-users] [EXT] --mem is not limiting the job's memory<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div id="m_-5780399477648684550m_-8172360048926361758m_-9217506820023448158APLWarningText">
<table border="0" cellspacing="0" cellpadding="0" align="left">
<tbody>
<tr>
<td width="100%" style="width:100%;background:rgb(224,224,224);padding:0in">
<p class="MsoNormal">
<b><span style="color:red">APL external email warning: </span></b><span style="color:black">Verify sender
<a href="mailto:slurm-users-bounces@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users-bounces@lists.schedmd.com</a> before clicking links or attachments</span><u></u><u></u></p>
</td>
</tr>
</tbody>
</table>
<p> <u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">thank you Vlad - looks like we have the same yes's<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">Do you remember if you had to make any settings on the OS level or in the kernel to make it work?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">-b<u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Thu, Jun 22, 2023 at 5:31 PM Ozeryan, Vladimir <<a href="mailto:Vladimir.Ozeryan@jhuapl.edu" rel="noreferrer" target="_blank">Vladimir.Ozeryan@jhuapl.edu</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Hello,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">We have the following configured and it seems to be working ok.</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal" style="margin-bottom:12pt"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">CgroupAutomount=yes<br>
ConstrainCores=yes<br>
ConstrainDevices=yes<br>
ConstrainRAMSpace=yes</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Vlad.</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:11pt;font-family:Calibri,sans-serif">From:</span></b><span style="font-size:11pt;font-family:Calibri,sans-serif"> slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users-bounces@lists.schedmd.com</a>>
<b>On Behalf Of </b>Boris Yazlovitsky<br>
<b>Sent:</b> Thursday, June 22, 2023 4:50 PM<br>
<b>To:</b> Slurm User Community List <<a href="mailto:slurm-users@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users@lists.schedmd.com</a>><br>
<b>Subject:</b> Re: [slurm-users] [EXT] --mem is not limiting the job's memory</span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">Hello Vladimir, thank you for your response.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">This is the cgroup.conf file:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">CgroupAutomount=yes<br>
ConstrainCores=yes<br>
ConstrainDevices=yes<br>
ConstrainRAMSpace=yes<br>
ConstrainSwapSpace=yes<br>
MaxRAMPercent=90<br>
AllowedSwapSpace=0<br>
AllowedRAMSpace=100<br>
MemorySwappiness=0<br>
MaxSwapPercent=0</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">/etc/default/grub:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">GRUB_DEFAULT=0<br>
GRUB_TIMEOUT_STYLE=hidden<br>
GRUB_TIMEOUT=0<br>
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`<br>
GRUB_CMDLINE_LINUX_DEFAULT=""<br>
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0 cgroup_enable=memory swapaccount=1"</span><u></u><u></u></p>
</div>
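<p class="MsoNormal">Edits to /etc/default/grub only take effect after the GRUB configuration is regenerated and the node is rebooted, so it may be worth confirming what the running kernel actually received. A minimal Python sketch, assuming the two parameters above are the ones of interest; missing_kernel_args is a hypothetical helper name.</p>
<pre>
```python
def missing_kernel_args(cmdline, required=("cgroup_enable=memory",
                                           "swapaccount=1")):
    """Return required boot parameters absent from a kernel command line."""
    present = set(cmdline.split())
    return [arg for arg in required if arg not in present]


# Sample taken from GRUB_CMDLINE_LINUX above. On a running node the live
# value could be read from /proc/cmdline instead.
sample = "net.ifnames=0 biosdevname=0 cgroup_enable=memory swapaccount=1"
print(missing_kernel_args(sample))
```
</pre>
<p class="MsoNormal">An empty list means both parameters are present on the command line that was checked.</p>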
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">what other cgroup settings need to be set?</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">&amp;&amp; thank you!</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">-b</span><u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<div>
<p class="MsoNormal">On Thu, Jun 22, 2023 at 4:02 PM Ozeryan, Vladimir <<a href="mailto:Vladimir.Ozeryan@jhuapl.edu" rel="noreferrer" target="_blank">Vladimir.Ozeryan@jhuapl.edu</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">--mem=5G should allocate 5G of memory per node.</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">Are your cgroups configured?</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:11pt;font-family:Calibri,sans-serif">From:</span></b><span style="font-size:11pt;font-family:Calibri,sans-serif"> slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users-bounces@lists.schedmd.com</a>>
<b>On Behalf Of </b>Boris Yazlovitsky<br>
<b>Sent:</b> Thursday, June 22, 2023 3:28 PM<br>
<b>To:</b> <a href="mailto:slurm-users@lists.schedmd.com" rel="noreferrer" target="_blank">slurm-users@lists.schedmd.com</a><br>
<b>Subject:</b> [EXT] [slurm-users] --mem is not limiting the job's memory</span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">Running slurm 22.03.02 on Ubuntu 22.04 server.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">Jobs submitted with --mem=5g are able to allocate an unlimited amount of memory.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">How can I limit, at the job submission level, how much memory a job can grab?</span><u></u><u></u></p>
</div>
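<p class="MsoNormal">One way to confirm that a job really overran its request is to compare the requested memory with the MaxRSS that sacct reports. A minimal Python sketch, assuming K/M/G/T suffixes are powers of 1024 and that a bare number means megabytes; to_kib and exceeded are hypothetical helper names, and the MaxRSS figure is the one from the sacct output elsewhere in this thread.</p>
<pre>
```python
UNITS_KIB = {"K": 1, "M": 1024, "G": 1024 ** 2, "T": 1024 ** 3}


def to_kib(value):
    """Convert a size such as '5g' or '104857988K' to KiB."""
    text = value.strip().upper()
    if text and text[-1] in UNITS_KIB:
        number, unit = text[:-1], text[-1]
    else:
        # Assumption: a bare number is taken to be megabytes.
        number, unit = text, "M"
    return int(float(number) * UNITS_KIB[unit])


def exceeded(requested, used):
    """True when observed usage is larger than the request."""
    return to_kib(used) > to_kib(requested)


print(exceeded("5g", "104857988K"))
```
</pre>
<p class="MsoNormal">This prints True: a MaxRSS of 104857988K is roughly 100 GiB, far above a 5G request, which is what an unenforced limit looks like.</p>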
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New"">thanks, and best regards!<br>
Boris</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Courier New""> </span><u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div></blockquote></div>
</blockquote></div>
</blockquote></div>