[slurm-users] Questions about default_queue_depth

David Henkemeyer david.henkemeyer at gmail.com
Wed Jan 12 17:26:26 UTC 2022


Hello,

A few weeks ago, we tested Slurm with about 50K jobs and observed at
least one instance where a node went idle while there were jobs in the
queue that could have run on it.  Our best guess so far is that
default_queue_depth was left at its default value of 100, and that the
eligible jobs were not among the first 100 jobs in the queue.  Based on
this, I have a few questions:
1) What is a reasonable value for default_queue_depth?  Would 1000 be ok,
in terms of performance?
2) How can we better debug why queued jobs are not being selected?
3) Is there a way to see the order of the jobs in the queue?  Perhaps
squeue lists the jobs in order?
4) If we had several partitions, would the default_queue_depth apply to all
partitions?
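
To make the hypothesis concrete, here is a toy model of what we suspect
happened.  It is only a sketch and not Slurm's actual scheduling logic:
the Job class, the pick_job function, and the "gpu"/"cpu" feature names
are all made up for illustration.  The assumption is that the scheduler
examines only the first queue_depth jobs in priority order each time a
node frees up.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    priority: int
    needs_feature: str  # hypothetical node feature the job requires


def pick_job(pending, node_feature, queue_depth):
    """Return the first job, in priority order, that fits the idle node.

    Only the first `queue_depth` jobs are examined (assumed behavior).
    Returns None if none of the examined jobs fit.
    """
    ordered = sorted(pending, key=lambda j: (-j.priority, j.job_id))
    for job in ordered[:queue_depth]:
        if job.needs_feature == node_feature:
            return job
    return None


# 150 high-priority jobs that need a "gpu" node, plus one low-priority
# job that could run on the idle "cpu" node.
pending = [Job(i, priority=1000, needs_feature="gpu") for i in range(150)]
pending.append(Job(150, priority=10, needs_feature="cpu"))

shallow = pick_job(pending, "cpu", queue_depth=100)   # depth too small
deep = pick_job(pending, "cpu", queue_depth=1000)     # depth large enough

print("depth 100 :", shallow)
print("depth 1000:", deep)
```

In this model, a depth of 100 never reaches the one runnable job because
it sits at position 151, so the "cpu" node stays idle; a depth of 1000
finds it.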

Thank you
David