<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Jeffrey,<br>
<br>
<br>
Thanks for the hint regarding scontrol reconfig. That one drove me
nuts again. <br>
I changed it to MaxArraySize=100000 and restarted slurmctld, since I
had also changed some features of the nodes.<br>
<br>
I soon realized that I could only submit --array=1-99999, so I had
already increased MaxArraySize to 100001 myself and run scontrol
reconfig.<br>
<br>
The behaviour was still the same. Now I know why :)<br>
<br>
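For reference, the off-by-one can be sketched in a few lines of
Python. This is only an illustration of the rule from the job_array
documentation quoted further down (the largest index is MaxArraySize
minus one); the function names are made up for this example and are
not part of Slurm.<br>
<br>
<pre>
```python
def max_array_index(max_array_size):
    """Largest array index accepted for a given MaxArraySize (indices start at 0)."""
    return max_array_size - 1


def array_range_fits(first, last, max_array_size):
    """True if --array=first-last stays within the configured MaxArraySize."""
    # Valid indices are 0 .. max_array_size - 1, and first must not exceed last.
    return last in range(max_array_size) and first in range(last + 1)
```
</pre>
With MaxArraySize=100000, array_range_fits(1, 99999, 100000) holds
but array_range_fits(1, 100000, 100000) does not, which matches the
--array=1-99999 limit above.<br>
<br>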
<br>
Best,<br>
Marcus<br>
<br>
<div class="moz-cite-prefix">On 2/26/19 3:27 PM, Jeffrey Frey wrote:<br>
</div>
<blockquote type="cite"
cite="mid:FA290C81-D6AC-49F6-BBBF-C40EA1E002DA@udel.edu">
<div class="">Also see "<a
href="https://slurm.schedmd.com/slurm.conf.html" class=""
moz-do-not-send="true">https://slurm.schedmd.com/slurm.conf.html</a>"
for MaxArraySize/MaxJobCount.</div>
<div class=""><br class="">
</div>
We just went through a user-requested adjustment to MaxArraySize
to bump it from 1000 to 10000; as the documentation states, since
each index of an array job is essentially "a job," you must be
sure to also adjust MaxJobCount (from 10000 to 100000 in our
case). Adjusting MaxJobCount requires a restart of slurmctld;
though the documentation doesn't state it, so does adjustment of
MaxArraySize (scontrol reconfigure will succeed but leave the
previous limit in effect, see "<a
href="https://bugs.schedmd.com/show_bug.cgi?id=6553" class=""
moz-do-not-send="true">https://bugs.schedmd.com/show_bug.cgi?id=6553</a>").
<div class="">
<div class=""><br class="">
</div>
<div class="">The "MaxArraySize" is a bit of a misnomer since
it's really 1 + the top of the valid range of indices --
"MaxArrayIndex" would be more apt. Our users were very happy
with Grid Engine's allowance of any index range and striding
that produces no more than "max_aj_tasks" indices; since
moving to Slurm they're forced to come up with their own
index-mapping functionality at times, but the relatively low
MaxArraySize versus what we had in GridEngine (75000) has been
especially frustrating for them.</div>
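<div class=""><br class="">
</div>
<div class="">As a rough idea of what such index mapping can look
like, here is a minimal Python sketch. It assumes the job is
submitted with a dense range such as --array=0-(N-1) and that each
task reads the SLURM_ARRAY_TASK_ID environment variable; the
function names and the start/step values are hypothetical and only
serve as an illustration.</div>
<pre class="">
```python
import os


def logical_index(task_id, start, step):
    """Map a dense Slurm array index (0, 1, 2, ...) onto start, start+step, ..."""
    return start + task_id * step


def task_count(start, stop, step):
    """Number of array tasks needed to cover start..stop (inclusive) with this step."""
    return len(range(start, stop + 1, step))


def index_for_this_task(start, step):
    """Logical index for the current array task; falls back to task 0 outside Slurm."""
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    return logical_index(task_id, start, step)
```
</pre>
<div class="">For a hypothetical sweep over 50000, 50010, ...,
74990, task_count(50000, 74990, 10) gives 2500, so --array=0-2499
would fit under a MaxArraySize of 10000 even though the logical
indices themselves do not.</div>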
<div class=""><br class="">
</div>
<div class="">So far the 10000/100000 combo hasn't come close to
exhausting resources on our slurmctld nodes, but we haven't
actually submitted a couple of 10000-index array jobs plus enough
other jobs to hit 100000 active jobs, so current memory usage
isn't an adequate measure of usage under load. Since the
slurm.conf documentation states:</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
</div>
<blockquote style="margin: 0 0 0 40px; border: none; padding:
0px;" class="">
<div class="">
<div class="">Performance can suffer with more than a few
hundred thousand jobs. </div>
</div>
</blockquote>
<div class="">
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">we're reluctant to increase MaxJobCount too much
higher.</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
<blockquote type="cite" class="">On Feb 26, 2019, at 3:18 AM,
Ole Holm Nielsen &lt;<a
href="mailto:Ole.H.Nielsen@fysik.dtu.dk" class=""
moz-do-not-send="true">Ole.H.Nielsen@fysik.dtu.dk</a>&gt;
wrote:<br class="">
<br class="">
On 2/26/19 9:07 AM, Marcus Wagner wrote:<br class="">
<blockquote type="cite" class="">Does anyone know why the
number of array elements is limited to 1000 by default?<br
class="">
We have one user who would like to have 100k array
elements!<br class="">
What is more difficult for the scheduler: one array job
with 100k elements or 100k non-array jobs?<br class="">
Where did you set the limit? Do your users use array jobs
at all?<br class="">
</blockquote>
<br class="">
Google is your friend :-)<br class="">
<br class="">
<a href="https://slurm.schedmd.com/job_array.html" class=""
moz-do-not-send="true">https://slurm.schedmd.com/job_array.html</a><br
class="">
<br class="">
<blockquote type="cite" class="">A new configuration
parameter has been added to control the maximum job array
size: MaxArraySize. The smallest index that can be
specified by a user is zero and the maximum index is
MaxArraySize minus one. The default value of MaxArraySize
is 1001. The maximum MaxArraySize supported in Slurm is
4000001. Be mindful about the value of MaxArraySize as job
arrays offer an easy way for users to submit large numbers
of jobs very quickly.<br class="">
</blockquote>
<br class="">
/Ole<br class="">
<br class="">
</blockquote>
<br class="">
<div class=""><br class="">
::::::::::::::::::::::::::::::::::::::::::::::::::::::<br
class="">
Jeffrey T. Frey, Ph.D.<br class="">
Systems Programmer V / HPC Management<br class="">
Network &amp; Systems Services / College of Engineering<br
class="">
University of Delaware, Newark DE 19716<br class="">
Office: (302) 831-6034 Mobile: (302) 419-4976<br class="">
::::::::::::::::::::::::::::::::::::::::::::::::::::::<br
class="">
<br class="">
<br class="">
<br class="">
</div>
<br class="">
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
<a class="moz-txt-link-abbreviated" href="mailto:wagner@itc.rwth-aachen.de">wagner@itc.rwth-aachen.de</a>
<a class="moz-txt-link-abbreviated" href="http://www.itc.rwth-aachen.de">www.itc.rwth-aachen.de</a>
</pre>
</body>
</html>