<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from text --><style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<style type="text/css" style="">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div dir="ltr">
<div id="x_divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<p>Hi Loris</p>
<p><br>
</p>
<p>indeed <span style="color:rgb(33,33,33); font-family:wf_segoe-ui_normal,"Segoe UI","Segoe WP",Tahoma,Arial,sans-serif,serif,EmojiFont; font-size:13.3333px"> </span><a href="https://slurm.schedmd.com/resource_limits.html" target="_blank" rel="noopener noreferrer" id="LPlnk73428" style="font-family:wf_segoe-ui_normal,"Segoe UI","Segoe WP",Tahoma,Arial,sans-serif,serif,EmojiFont; font-size:13.3333px">https://slurm.schedmd.com/resource_limits.html</a> explains
the available limit options.</p>
<p><br>
</p>
<p>At present, I do not limit memory for specific users; I only set a global limit in slurm.conf:</p>
<p> <span><i>MaxMemPerNode=65536</i> (a 64 GB limit; the value is in MB) </span></p>
<p><span><br>
</span></p>
<p><span>But anyway, with my Slurm version (20.02), any user can obtain MORE than 64 GB of memory by using the "--mem=0" option!</span></p>
<p><span>So I had to filter this in <span style="font-family:Calibri,Helvetica,sans-serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols; font-size:16px"> job_submit.lua </span></span></p>
<p><span><span style="font-family:Calibri,Helvetica,sans-serif,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols; font-size:16px"><br>
</span></span></p>
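<p>A minimal sketch of such a filter, assuming that job_desc.min_mem_per_node and job_desc.min_mem_per_cpu are given in MB and are nil when the corresponding option was not used. The helper name check_memory_request and the constant MAX_MEM_PER_NODE_MB are hypothetical and only mirror the MaxMemPerNode value above. Besides "--mem=0", it also rejects an explicit per-node request above the cap:</p>

```lua
-- Hypothetical cap in MB, chosen to mirror MaxMemPerNode=65536 in slurm.conf.
MAX_MEM_PER_NODE_MB = 65536

-- Returns true if the memory request is acceptable, or false plus a reason.
-- Assumption: unset memory fields are nil and values are in MB.
function check_memory_request(job_desc, max_mb)
   local per_node = job_desc.min_mem_per_node
   local per_cpu = job_desc.min_mem_per_cpu
   -- A value of 0 means "all memory on the node".
   if per_node == 0 or per_cpu == 0 then
      return false, "a memory request of 0 (all node memory) is not allowed"
   end
   -- An explicit per-node request above the cap is rejected as well.
   if per_node ~= nil and per_node > max_mb then
      return false, string.format("requested %.0f MB per node, limit is %.0f MB",
                                  per_node, max_mb)
   end
   return true
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   local ok, reason = check_memory_request(job_desc, MAX_MEM_PER_NODE_MB)
   if not ok then
      slurm.log_info("job_submit: job from uid %s rejected: %s",
                     tostring(submit_uid), reason)
      slurm.log_user(string.format("Job rejected: %s", reason))
      return slurm.ERROR
   end
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

<p>This sketch does not multiply a per-CPU request by the number of CPUs, so a large "--mem-per-cpu" value could still exceed the cap; that check would need the CPU count of the job.</p>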
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Loris Bennett &lt;loris.bennett@fu-berlin.de&gt;<br>
<b>Sent:</b> Thursday, December 8, 2022 10:57:56 AM<br>
<b>To:</b> Slurm User Community List<br>
<b>Subject:</b> Re: [slurm-users] srun --mem issue</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:10pt;">
<div class="PlainText">Loris Bennett &lt;loris.bennett@fu-berlin.de&gt; writes:<br>
<br>
> Moshe Mergy &lt;moshe.mergy@weizmann.ac.il&gt; writes:<br>
><br>
>> Hi Sandor<br>
>><br>
>> I personally block "--mem=0" requests in the file job_submit.lua (slurm 20.02):<br>
>><br>
>> if (job_desc.min_mem_per_node == 0 or job_desc.min_mem_per_cpu == 0) then<br>
>> slurm.log_info("%s: ERROR: unlimited memory requested", log_prefix) <br>
>> slurm.log_info("%s: ERROR: job %s from user %s rejected because of an invalid (unlimited) memory request.", log_prefix, job_desc.name, job_desc.user_name)
<br>
>> slurm.log_user("Job rejected because of an invalid memory request.") <br>
>> return slurm.ERROR<br>
>> end<br>
><br>
> What happens if somebody explicitly requests all the memory, so in<br>
> Sandor's case --mem=500G ?<br>
><br>
>> Maybe there is a better or nicer solution...<br>
<br>
Can't you just use account and QOS limits:<br>
<br>
<a href="https://slurm.schedmd.com/resource_limits.html">https://slurm.schedmd.com/resource_limits.html</a><br>
<br>
?<br>
<br>
And anyway, what is the use-case for preventing someone using all the<br>
memory? In our case, if someone really needs all the memory, they should be able<br>
to have it. <br>
<br>
However, I do have a chronic problem with users requesting too much<br>
memory. My approach has been to try to get people to use 'seff' to see<br>
what resources their jobs in fact need. In addition each month we<br>
generate a graphical summary of 'seff' data for each user, like the one<br>
shown here<br>
<br>
<a href="https://www.fu-berlin.de/en/sites/high-performance-computing/Dokumentation/Statistik">
https://www.fu-berlin.de/en/sites/high-performance-computing/Dokumentation/Statistik</a><br>
<br>
and automatically send an email to those with a large percentage of<br>
resource-inefficient jobs telling them to look at their graphs and<br>
correct their resource requirements for future jobs.<br>
<br>
Cheers,<br>
<br>
Loris<br>
<br>
>> All the best<br>
>> Moshe<br>
>><br>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br>
>> From: slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Felho, Sandor &lt;Sandor.Felho@transunion.com&gt;<br>
>> Sent: Wednesday, December 7, 2022 7:03 PM<br>
>> To: slurm-users@lists.schedmd.com<br>
>> Subject: [slurm-users] srun --mem issue <br>
>> <br>
>> TransUnion is running a ten-node site using slurm with multiple queues. We have an issue with the --mem parameter. There is one user who has read the slurm manual and found the<br>
>> --mem=0 option. This gives a single job the maximum memory on the node (500 GiB). How can I block a --mem=0 request?<br>
>><br>
>> We are running:<br>
>><br>
>> * OS: RHEL 7<br>
>> * cgroups version 1<br>
>> * slurm: 19.05<br>
>><br>
>> Thank you,<br>
>><br>
>> Sandor Felho <br>
>><br>
>> Sr Consultant, Data Science & Analytics <br>
>><br>
-- <br>
Dr. Loris Bennett (Herr/Mr)<br>
ZEDAT, Freie Universität Berlin<br>
<br>
</div>
</span></font>
</body>
</html>