<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Can anyone shed some light on where the _virtual_ memory limit comes from? We're getting jobs killed with the message:<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class="">slurmstepd: error: Step 3664.0 exceeded virtual memory limit (79348101120 > 72638634393), being killed<br class=""></div></blockquote><div class="">Is this limit dictated by cgroup.conf or by some srun option (like --mem-per-cpu)? And where could this number come from on a machine that has 64 GB nodes, where DefMemPerCPU for the partition is 64 GB / 32 (threads) and cgroup.conf has AllowedSwapSpace=75?</div><div class=""><br class=""></div><div class="">And a couple of related questions:</div><div class="">1. If I define DefMemPerCPU in the partition line, and the job doesn't request anything else, which memory measure should I expect this limit to apply to? RSS?</div><div class=""><br class=""></div><div class="">2. In general, what's the right way to disable swapping by default, but let individual jobs request permission to swap?</div><div class=""><br class=""></div><div class=""><span class="Apple-tab-span" style="white-space:pre"> </span>thanks,</div><div class=""><span class="Apple-tab-span" style="white-space:pre"> </span>Noam<br class=""><div class=""><br class=""></div></div></body></html>