<html style="direction: ltr;">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
</head>
<body bidimailui-charset-is-forced="true" style="direction: ltr;"
text="#000000" bgcolor="#FFFFFF">
<p>Hi,</p>
<p><br>
</p>
<p>I'd like to allow job suspension in my cluster, without the
"penalty" of RAM utilization. The jobs are sometimes very big and
can require ~100 GB of memory on each node. Suspending such a job
usually means almost nothing else can run on the same node, except
for jobs with very small memory requirements.</p>
<p>Currently the solution is requeue preemption with or without
checkpointing.</p>
<p>I don't want to use swap for running jobs, ever - I'd rather get
OOM killed than use swap while the job is running.<br>
</p>
<p><br>
</p>
<p>Is there a way to tell Slurm to allocate swap and use it only for
suspending, to allow preemption without terminating the jobs?</p>
<p><br>
</p>
<p>The nodes have ~TB of disk space each, and most jobs never
utilize any of that (relying on shared storage instead), so local
disk space is usually not a concern.</p>
<p><br>
</p>
<p>Using swap to store suspended jobs, while slow to freeze and
thaw, seems to me to be a better localized solution than
checkpointing and requeuing, since it allows the job to resume
"immediately" (sans disk I/O times) after the high priority job
finishes. If I'm mistaken, please enlighten me.</p>
<p><br>
</p>
<p>I was wondering whether simply setting up a large swap in Linux,
while setting AllowedSwapSpace=0 in cgroup.conf, would work, but I
suspect the following:</p>
<p>1. Even when suspended, the job still remains subject to its
cgroup limits, and</p>
<p>2. Which process gets swapped is non-deterministic from my point
of view - I'm not sure the kernel will swap out the suspended job
rather than the new job, at least in its early stages.<br>
</p>
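<p>A minimal sketch of the configuration described above, for
illustration only. AllowedSwapSpace=0 is the setting mentioned above;
the remaining lines are assumptions about what a suspend-based
preemption setup would typically include, and are untested here:</p>
<pre># cgroup.conf (only AllowedSwapSpace=0 comes from the text above;
# the two Constrain* lines are assumed so that the limit is enforced)
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
AllowedSwapSpace=0

# slurm.conf (assumed: suspend rather than requeue on preemption)
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG
</pre>
<p>Whether the kernel would then swap out the suspended job, given
points 1 and 2, is exactly what I'm unsure about.</p>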
<pre class="moz-signature" cols="72">Thanks in advance,
--Dani_L.
</pre>
</body>
</html>