[slurm-users] Question about having 2 partitions that are mutually exclusive, but have unexpected interactions
toomuchit at gmail.com
Thu May 12 16:10:39 UTC 2022
I suspect your setting for "MaxJobCount" is too low. From the slurm.conf documentation:

    The maximum number of jobs SLURM can have in its active database
    at one time. Set the values of *MaxJobCount* and *MinJobAge* to
    ensure the slurmctld daemon does not exhaust its memory or other
    resources. Once this limit is reached, requests to submit
    additional jobs will fail. The default value is 5000 jobs. This
    value may not be reset via "scontrol reconfig"; it only takes
    effect upon restart of the slurmctld daemon. May not exceed ...

So if you already have (by default) 5000 jobs being considered, the
remaining jobs aren't even looked at.
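If that turns out to be the cause, raising MaxJobCount (and keeping
MinJobAge low enough that finished job records leave the active
database promptly) in slurm.conf may help. A sketch only -- the
numbers below are illustrative, not recommendations:

```
# slurm.conf (illustrative values -- tune for your workload)
MaxJobCount=50000   # jobs slurmctld keeps in its active database
MinJobAge=300       # seconds a completed job record is retained
```

As the documentation excerpt above notes, MaxJobCount only takes
effect after restarting slurmctld; "scontrol reconfig" is not enough.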
On 5/12/2022 7:34 AM, David Henkemeyer wrote:
> Question for the braintrust:
> I have 3 partitions:
> * Partition A_highpri: 80 nodes
> * Partition A_lowpri: same 80 nodes
> * Partition B_lowpri: 10 different nodes
> There is no overlap between A and B partitions.
> Here is what I'm observing. If I fill the queue with ~20-30k jobs for
> partition A_highpri, and several thousand to partition A_lowpri, then,
> a bit later, submit jobs to partition B_lowpri, I am observing that
> the Partition B jobs _are queued and not running right away, and are
> given a pending reason of "Priority"_, which doesn't seem right to me.
> Yes, there are higher priority jobs pending in the queue (the jobs
> bound for A_hi), but there aren't any higher priority jobs pending
> /for the same partition/ as the Partition B jobs, so theoretically,
> these partition B jobs should not be held up. Eventually, the
> scheduler gets around to scheduling them, but it seems to take a while
> for the scheduler (which is probably pretty busy dealing with
> job starts, job stops, etc) to figure this out.
> If I schedule fewer jobs to the A partitions ( ~3k jobs ), then the
> scheduler schedules the PartitionB jobs much faster, as expected. As
> I increase from 3k, then partition B jobs get held up longer and longer.
> I can raise the priority on partition B, and that does solve the
> problem, but I don't want those jobs to impact the partition A_lowpri
> jobs. In fact, _I don't want any cross-partition influence_.
> I'm hoping there is a slurm parameter I can tweak to make slurm
> recognize that these partition B jobs shouldn't ever have a pending
> state of "priority". Or to treat these as 2 separate queues. Or
> something like that. Spinning up a 2nd slurm controller is not ideal
> for us (unless there is a lightweight method to do it).
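As a quick way to see whether the Partition B jobs really are waiting
behind jobs from another partition, squeue can print each pending
job's partition next to its pending reason (using the standard format
specifiers %i job ID, %P partition, %r reason):

```
# List pending jobs with job ID, partition, and pending reason
squeue --states=PENDING --format="%.18i %.12P %.10r"
```

If jobs in B_lowpri show reason "Priority" while no higher-priority
jobs are pending in that same partition, the scheduler is likely
hitting a depth or job-count limit before it ever examines them.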