[slurm-users] ticking time bomb? launching too many jobs in parallel

Guillaume Perrault Archambault gperr050 at uottawa.ca
Sat Aug 31 15:12:57 UTC 2019


Hi Steven,

Thanks for your help.

Looks like a QOS is the way to go if I want both job arrays and per-user
limits on jobs/resources (in the context of a regression test).
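
For reference, a minimal sketch of what that QOS setup might look like. The
QOS name (regtest), the limit values, and the submission script name are only
examples, and AccountingStorageEnforce in slurm.conf has to include
limits/qos for the caps to actually be enforced:

    # create a QOS and cap concurrent running jobs per user
    sacctmgr add qos regtest
    sacctmgr modify qos regtest set MaxJobsPerUser=10

    # attach the QOS to my user's association
    sacctmgr modify user gperr050 set qos+=regtest

    # submit the job array under that QOS
    sbatch --qos=regtest --array=0-99 run_test.sh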

Regards,
Guillaume.

On Fri, Aug 30, 2019 at 6:11 PM Steven Dick <kg4ydw at gmail.com> wrote:

> On Fri, Aug 30, 2019 at 2:58 PM Guillaume Perrault Archambault
> <gperr050 at uottawa.ca> wrote:
> > My problem with that though, is what if each script (the 9 scripts in my
> earlier example) each require different requirements? For example, run on a
> different partition, or set a different time limit? My understanding is
> that for a single job array, each job will get the same job requirements.
>
> That's a little messier and may be less suitable for an array job.
> However, some of that can be accomplished. You can, for instance,
> submit a job to multiple partitions and then use srun within the job
> to allocate resources to individual tasks. But you get a lot less
> control over how the resources are spread, so it might not be workable.
>
> > The other problem is that, with the way I've implemented it, I can change
> the max jobs dynamically.
>
> Others have indicated in this thread that qos can be dynamically
> changed; I don't recall trying that, but if you did, I think you'd do
> it with scontrol.
>
>
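
A rough sketch of the multi-partition approach described above: submit one
job that may start in either partition, then carve the allocation into steps
with srun. The partition names, task counts, and test script names are
placeholders:

    #!/bin/bash
    #SBATCH --partition=partA,partB   # job starts in whichever partition frees up first
    #SBATCH --ntasks=8
    #SBATCH --time=04:00:00

    # run each test as its own job step on a slice of the allocation;
    # --exclusive keeps the steps off each other's CPUs
    srun --ntasks=2 --exclusive ./test_small.sh &
    srun --ntasks=6 --exclusive ./test_large.sh &
    wait

As noted above, per-step control (e.g. a different time limit per test via
srun --time) is coarser than what separate jobs give, so it may or may not
fit the regression-test use case.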
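
And a sketch of what changing the limits on the fly could look like; the QOS
name, limit value, and job ID are placeholders. The QOS limit itself is
changed with sacctmgr, while scontrol can adjust an already-submitted array:

    # raise the per-user cap on the QOS at runtime
    sacctmgr modify qos regtest set MaxJobsPerUser=20

    # for an array submitted with a throttle (e.g. --array=0-99%10),
    # change how many tasks may run at once after submission
    scontrol update jobid=12345 ArrayTaskThrottle=20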