[slurm-users] maximum size of array jobs

Marcus Wagner wagner at itc.rwth-aachen.de
Tue Feb 26 15:05:22 UTC 2019

Hi Jeffrey,

thanks for the hint regarding scontrol reconfig. That one drove me nuts.
I changed it to MaxArraySize=100000. I restarted slurmctld, since I also
changed some features of the nodes.
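
Since MaxArraySize and MaxJobCount have to be raised together, here is a
minimal sketch of a sanity check on a slurm.conf-style file. The function
names conf_value and check_array_limits are made up for illustration, the
"MaxJobCount larger than MaxArraySize" rule is only a rough reading of the
documentation, and the parsing assumes one Key=Value setting per line:

```shell
#!/usr/bin/env bash
# Read one "Key=Value" setting from a slurm.conf-style file.
# Comments are stripped, the key is matched case-insensitively,
# and the last occurrence wins. (Hypothetical helper.)
conf_value() {
    local key=$1 file=$2
    sed 's/#.*//' "$file" \
        | grep -i "^[[:space:]]*${key}=" \
        | tail -n 1 | cut -d= -f2 | tr -d '[:space:]'
}

# Succeed only if MaxJobCount is larger than MaxArraySize.
# (Hypothetical helper; the threshold is an assumption, not a Slurm rule.)
check_array_limits() {
    local file=$1 array jobs
    array=$(conf_value MaxArraySize "$file")
    jobs=$(conf_value MaxJobCount "$file")
    # Fall back to the documented defaults when a key is absent.
    array=${array:-1001}
    jobs=${jobs:-10000}
    if (( jobs > array )); then
        echo "ok: MaxArraySize=$array MaxJobCount=$jobs"
    else
        echo "warning: MaxJobCount=$jobs is not larger than MaxArraySize=$array"
        return 1
    fi
}
```

It would be called as check_array_limits /etc/slurm/slurm.conf, where the
path is only an example and depends on the installation.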

I soon realized that I could only submit --array=1-99999, so I had
already increased MaxArraySize to 100001 myself and did an scontrol
reconfig.

Behaviour was still the same. Now I know why :)
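
To spell out the off-by-one: valid indices run from 0 to MaxArraySize - 1,
so the setting has to be one more than the highest index a user wants.
A minimal sketch of that arithmetic (both function names are made up for
illustration):

```shell
#!/usr/bin/env bash
# Highest index that can be submitted for a given MaxArraySize.
highest_allowed_index() {      # hypothetical helper name
    local max_array_size=$1
    echo $(( max_array_size - 1 ))
}

# MaxArraySize needed so that a given highest index is accepted.
required_max_array_size() {    # hypothetical helper name
    local highest_index=$1
    echo $(( highest_index + 1 ))
}

highest_allowed_index 100000     # MaxArraySize=100000 allows --array up to 99999
required_max_array_size 100000   # --array=1-100000 needs MaxArraySize=100001
```

After the slurmctld restart, "scontrol show config | grep -i MaxArraySize"
should show which value is actually in effect.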


On 2/26/19 3:27 PM, Jeffrey Frey wrote:
> Also see "https://slurm.schedmd.com/slurm.conf.html" for 
> MaxArraySize/MaxJobCount.
> We just went through a user-requested adjustment to MaxArraySize to 
> bump it from 1000 to 10000; as the documentation states, since each 
> index of an array job is essentially "a job," you must be sure to also 
> adjust MaxJobCount (from 10000 to 100000 in our case). 
>  Adjusting MaxJobCount requires a restart of slurmctld; though the 
> documentation doesn't state it, so does adjustment of MaxArraySize 
> (scontrol reconfigure will succeed but leave the previous limit in 
> effect, see "https://bugs.schedmd.com/show_bug.cgi?id=6553").
> The "MaxArraySize" is a bit of a misnomer since it's really 1 + the 
> top of the valid range of indices -- "MaxArrayIndex" would be more 
> apt.  Our users were very happy with Grid Engine's allowance of any 
> index range and striding that produces no more than "max_aj_tasks" 
> indices; since moving to Slurm they're forced to come up with their 
> own index-mapping functionality at times, but the relatively low 
> MaxArraySize versus what we had in GridEngine (75000) has been 
> especially frustrating for them.
> So far the 10000/100000 combo hasn't come close to exhausting 
> resources on our slurmctld nodes; but we haven't actually submitted a 
> couple 10000-index array jobs and enough other jobs to hit 100000 
> active jobs, so current memory usage isn't an adequate measure of 
> usage under load.  Since the slurm.conf documentation states:
>     Performance can suffer with more than a few hundred thousand jobs.
> we're reluctant to increase MaxJobCount too much higher.
>> On Feb 26, 2019, at 3:18 AM, Ole Holm Nielsen
>> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>> On 2/26/19 9:07 AM, Marcus Wagner wrote:
>>> Does anyone know, why per default the number of array elements is 
>>> limited to 1000?
>>> We have one user, who would like to have 100k array elements!
>>> What is more difficult for the scheduler, one array job with 100k 
>>> elements or 100k non-array jobs?
>>> Where did you set the limit? Do your users use array jobs at all?
>> Google is your friend :-)
>> https://slurm.schedmd.com/job_array.html
>>> A new configuration parameter has been added to control the maximum 
>>> job array size: MaxArraySize. The smallest index that can be 
>>> specified by a user is zero and the maximum index is MaxArraySize 
>>> minus one. The default value of MaxArraySize is 1001. The maximum 
>>> MaxArraySize supported in Slurm is 4000001. Be mindful about the 
>>> value of MaxArraySize as job arrays offer an easy way for users to 
>>> submit large numbers of jobs very quickly.
>> /Ole
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
> Jeffrey T. Frey, Ph.D.
> Systems Programmer V / HPC Management
> Network & Systems Services / College of Engineering
> University of Delaware, Newark DE  19716
> Office: (302) 831-6034  Mobile: (302) 419-4976
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::

Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
