[slurm-users] maximum size of array jobs
Jeffrey Frey
frey at udel.edu
Tue Feb 26 14:27:00 UTC 2019
Also see "https://slurm.schedmd.com/slurm.conf.html" for MaxArraySize/MaxJobCount.
We just went through a user-requested adjustment to MaxArraySize to bump it from 1000 to 10000; as the documentation states, since each index of an array job is essentially "a job," you must be sure to also adjust MaxJobCount (from 10000 to 100000 in our case). Adjusting MaxJobCount requires a restart of slurmctld; though the documentation doesn't state it, so does adjustment of MaxArraySize (scontrol reconfigure will succeed but leave the previous limit in effect, see "https://bugs.schedmd.com/show_bug.cgi?id=6553").
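A rough way to sanity-check the pairing described above is sketched here in Python. This is only an illustration, not anything Slurm ships: the function names are hypothetical, the parser assumes each setting sits on its own "Key=Value" line, and the fallback defaults (1001 for MaxArraySize, 10000 for MaxJobCount) are assumptions taken from the documentation rather than read from a live controller.

```python
def read_limits(conf_text):
    """Pull MaxArraySize and MaxJobCount out of slurm.conf-style text.

    Simplified, hypothetical parser: one Key=Value per line, '#' starts a
    comment, keys matched case-insensitively. Defaults are assumptions.
    """
    limits = {"maxarraysize": 1001, "maxjobcount": 10000}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip().lower()
        if key in limits:
            limits[key] = int(value.strip())
    return limits


def full_arrays_that_fit(limits):
    """How many maximum-size array jobs fit under MaxJobCount,
    counting each array index as one job."""
    return limits["maxjobcount"] // limits["maxarraysize"]


sample = """
MaxArraySize=10000   # bumped from 1000
MaxJobCount=100000   # bumped from 10000
"""
limits = read_limits(sample)
print(limits, full_arrays_that_fit(limits))
```

A check like this only tells you what the file says; as noted above, the running slurmctld keeps the old limits until it is restarted.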
The name "MaxArraySize" is a bit of a misnomer since the value is really 1 + the top of the valid range of indices -- "MaxArrayIndex" would be more apt. Our users were very happy with Grid Engine's allowance of any index range and striding that produces no more than "max_aj_tasks" indices. Since moving to Slurm they're at times forced to come up with their own index-mapping functionality, and the relatively low MaxArraySize versus what we had in Grid Engine (75000) has been especially frustrating for them.
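For anyone wondering what such index mapping might look like, here is a minimal Python sketch. The function names and the example range are hypothetical. The only Slurm-specific piece is the SLURM_ARRAY_TASK_ID environment variable that Slurm sets for each array task. The idea is to submit a small array (say indices 0-999) and have each task work through its share of a larger, strided logical range.

```python
import os


def is_valid_index(index, max_array_size=1001):
    """Valid array indices run from 0 to MaxArraySize - 1 inclusive."""
    return 0 <= index < max_array_size


def logical_indices(task_id, first, last, step, n_tasks):
    """Logical indices handled by one array task.

    The logical range first..last (inclusive, with stride 'step') is dealt
    out round-robin across n_tasks array tasks numbered 0..n_tasks-1.
    """
    all_indices = range(first, last + 1, step)
    return list(all_indices[task_id::n_tasks])


if __name__ == "__main__":
    # Outside a Slurm array job the variable is unset, so fall back to 0.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    # Illustrative range only: 50000-75000 with stride 5, spread over 1000 tasks.
    for idx in logical_indices(task_id, 50000, 75000, 5, 1000):
        print(idx)
```

With the default MaxArraySize of 1001, index 1000 is accepted and 1001 is not, which is the off-by-one that makes the parameter name confusing.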
So far the 10000/100000 combo hasn't come close to exhausting resources on our slurmctld nodes, but we haven't actually submitted a couple of 10000-index array jobs plus enough other jobs to hit 100000 active jobs, so current memory usage isn't an adequate measure of usage under load. Since the slurm.conf documentation states:
    "Performance can suffer with more than a few hundred thousand jobs."
we're reluctant to increase MaxJobCount too much higher.
> On Feb 26, 2019, at 3:18 AM, Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>
> On 2/26/19 9:07 AM, Marcus Wagner wrote:
>> Does anyone know, why per default the number of array elements is limited to 1000?
>> We have one user, who would like to have 100k array elements!
>> What is more difficult for the scheduler, one array job with 100k elements or 100k non-array jobs?
>> Where did you set the limit? Do your users use array jobs at all?
>
> Google is your friend :-)
>
> https://slurm.schedmd.com/job_array.html
>
>> A new configuration parameter has been added to control the maximum job array size: MaxArraySize. The smallest index that can be specified by a user is zero and the maximum index is MaxArraySize minus one. The default value of MaxArraySize is 1001. The maximum MaxArraySize supported in Slurm is 4000001. Be mindful about the value of MaxArraySize as job arrays offer an easy way for users to submit large numbers of jobs very quickly.
>
> /Ole
>
::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034 Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::