[slurm-users] speed / efficiency of sacct vs. scontrol
Chris Samuel
chris at csamuel.org
Sat Feb 25 18:51:23 UTC 2023
On 23/2/23 2:55 am, David Laehnemann wrote:
> And consequently, would using `scontrol` thus be the better default
> option (as opposed to `sacct`) for repeated job status checks by a
> workflow management system?
Many others have commented on this, but using scontrol in this way is
really bad because of the impact it has on slurmctld. Responding to the
RPC (IIRC) requires taking read locks on internal data structures, and
on a large, busy system this can really damage scheduling performance.
Ours is one such system: we recently rolled over Slurm job IDs back to 1
after ~6 years of operation and run at over 90% occupancy most of the
time.
We've had numerous occasions where we've had to track down users abusing
scontrol in this way and redirect them to use sacct instead.
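
For a workflow manager, one way to follow that advice is to ask sacct about
all outstanding jobs in a single call rather than running one status command
per job. The following is only a rough sketch, not something we run in
production: the function names are made up for illustration, and it assumes
`sacct` is on the PATH and that accounting storage is enabled on the cluster.

```python
import subprocess


def build_sacct_command(job_ids):
    """Build one sacct call covering every job ID (hypothetical helper)."""
    return [
        "sacct",
        "-X",                    # allocations only, no per-step rows
        "--noheader",            # no column header line
        "--parsable2",           # '|'-delimited, no trailing delimiter
        "--format=JobID,State",
        "-j", ",".join(str(j) for j in job_ids),
    ]


def parse_sacct_output(text):
    """Turn 'JobID|State' lines into a {job_id: state} dict."""
    states = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, _, state = line.partition("|")
        # A state such as 'CANCELLED by 1000' is reduced to its first word.
        words = state.split()
        states[job_id] = words[0] if words else ""
    return states


def query_job_states(job_ids):
    """Run sacct once for all job IDs and return their states."""
    result = subprocess.run(
        build_sacct_command(job_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sacct_output(result.stdout)
```

A caller would then poll with something like `query_job_states([101, 102, 103])`
at a modest interval, so the number of queries stays constant no matter how
many jobs the workflow has in flight.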
We already use the cli_filter abilities in Slurm to impose a form of
rate limiting on RPCs from other commands, but unfortunately scontrol is
not covered by that.
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA