<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
I'm not sure I can help with the rest, but the EnforcePartLimits setting only rejects, at submission time, jobs that exceed
<b>partition</b> limits, not cluster-wide limits. Offhand, I don't see anything in the interactive partition definition that your request for 4 GB/CPU exceeds.</div>
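<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
For illustration only, a limit becomes a partition limit when it sits on the PartitionName line itself. Below is an untested sketch that assumes you want the same 500 MB cap on the interactive partition; the value is just copied from your global setting, and I can't promise EnforcePartLimits will then reject the request.<br>
<br>
,----slurm.conf (sketch, not a tested configuration)<br>
| # Assumption: reuse the global 500 MB value as a per-partition cap<br>
| PartitionName=interactive Nodes=duna State=UP Default=NO MaxTime=08:00:00 MaxCPUsPerNode=32 OverSubscribe=FORCE:2 MaxMemPerCPU=500<br>
`----<br>
<br>
As far as I recall from the slurm.conf man page, a --mem-per-cpu request above MaxMemPerCPU may cause Slurm to raise the job's CPU count instead of rejecting the job. It may be worth running "scontrol show job" on the allocation to see how many CPUs and how much memory it actually received.</div>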
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Rob</div>
<div>
<div><br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
<hr tabindex="-1" style="display:inline-block; width:98%;">
<b>From:</b> slurm-users on behalf of Angel de Vicente<br>
<b>Sent:</b> Monday, July 24, 2023 7:20 AM<br>
<b>To:</b> Slurm User Community List<br>
<b>Subject:</b> [slurm-users] MaxMemPerCPU not enforced?
<div><br>
</div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">Hello,<br>
<br>
I'm trying to get Slurm to control the memory used per CPU, but it does<br>
not seem to enforce the MaxMemPerCPU option in slurm.conf.<br>
<br>
This is running on Ubuntu 22.04 (cgroups v2), Slurm 23.02.3.<br>
<br>
Relevant configuration options:<br>
<br>
,----cgroup.conf<br>
| AllowedRAMSpace=100<br>
| ConstrainCores=yes<br>
| ConstrainRAMSpace=yes<br>
| ConstrainSwapSpace=yes<br>
| AllowedSwapSpace=0<br>
`----<br>
<br>
,----slurm.conf<br>
| TaskPlugin=task/affinity,task/cgroup<br>
| PrologFlags=X11<br>
| <br>
| SelectType=select/cons_res<br>
| SelectTypeParameters=CR_CPU_Memory,CR_CORE_DEFAULT_DIST_BLOCK<br>
| MaxMemPerCPU=500<br>
| DefMemPerCPU=200<br>
| <br>
| JobAcctGatherType=jobacct_gather/linux<br>
| <br>
| EnforcePartLimits=ALL<br>
| <br>
| NodeName=xxx RealMemory=257756 Sockets=4 CoresPerSocket=8 ThreadsPerCore=1 Weight=1<br>
| <br>
| PartitionName=batch Nodes=duna State=UP Default=YES MaxTime=2-00:00:00 MaxCPUsPerNode=32 OverSubscribe=FORCE:1<br>
| PartitionName=interactive Nodes=duna State=UP Default=NO MaxTime=08:00:00 MaxCPUsPerNode=32 OverSubscribe=FORCE:2<br>
`----<br>
<br>
<br>
I can ask for an interactive session with 4GB/CPU (I would have thought<br>
that "EnforcePartLimits=ALL" would stop me from doing that), and once<br>
I'm in the interactive session I can execute a 3GB test code without any<br>
issues (I can see with htop that the process does indeed use a RES size<br>
of 3GB at 100% CPU use). Any idea what could be the problem or how to<br>
start debugging this?<br>
<br>
,----<br>
| [angelv@xxx test]$ sinter -n 1 --mem-per-cpu=4000<br>
| salloc: Granted job allocation 127544<br>
| salloc: Nodes xxx are ready for job<br>
| <br>
| (sinter) [angelv@xxx test]$ stress -m 1 -t 600 --vm-keep --vm-bytes 3G<br>
| stress -m 1 -t 600 --vm-keep --vm-bytes 3G<br>
| stress: info: [1772392] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd<br>
`----<br>
<br>
Many thanks,<br>
-- <br>
Ángel de Vicente<br>
Research Software Engineer (Supercomputing and BigData)<br>
Tel.: +34 922-605-747<br>
Web.: <a href="http://research.iac.es/proyecto/polmag/" target="_blank" rel="noopener noreferrer" data-auth="NotApplicable">
http://research.iac.es/proyecto/polmag/</a><br>
<br>
GPG: 0x8BDC390B69033F52<br>
</div>
</span></font></div>
</div>
</body>
</html>