[slurm-users] Rate Limiting of RPC calls
Christopher Samuel
chris at csamuel.org
Wed Feb 10 01:33:42 UTC 2021
On 2/9/21 5:08 pm, Paul Edmon wrote:
> 1. Being on the latest release: A lot of work has gone into improving
> RPC throughput, if you aren't running the latest 20.11 release I highly
> recommend upgrading. 20.02 also was pretty good at this.
We've not gone to 20.11 on production systems yet, but I can vouch for
20.02 being far better than previous versions for scheduling performance.
We also use the cli_filter Lua plugin to implement our own RPC limiting
mechanism, which keeps per-user files in a local directory. The big
advantage is that the rate limiting happens client side, so the excess
RPCs never get sent to slurmctld in the first place. Yes, it is
theoretically possible for users to discover and work around this, but
the intent here is to catch accidental/naive use rather than anything
malicious.
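
To illustrate the idea of client-side limiting with per-user files, here
is a minimal sketch in Python. It is not the actual plugin (which is
written in Lua and is not shown here); the function name
`allow_request`, the sliding-window policy, and the `limit` and `window`
defaults are all assumptions made for illustration.

```python
import os
import time


def allow_request(state_dir, user, limit=5, window=10.0, now=None):
    """Return True if `user` may issue another request right now.

    Hypothetical helper: allows at most `limit` requests per `window`
    seconds, tracked in one file per user under `state_dir`.
    """
    now = time.time() if now is None else now
    os.makedirs(state_dir, exist_ok=True)
    path = os.path.join(state_dir, user)

    # Load the timestamps of this user's recent requests, if any.
    try:
        with open(path) as f:
            stamps = [float(line) for line in f if line.strip()]
    except FileNotFoundError:
        stamps = []

    # Keep only the requests that fall inside the sliding window.
    recent = [t for t in stamps if now - t < window]
    allowed = len(recent) < limit
    if allowed:
        recent.append(now)

    # Write the pruned list back. There is no locking here, so
    # concurrent clients for the same user may race.
    with open(path, "w") as f:
        f.writelines(f"{t!r}\n" for t in recent)
    return allowed
```

A client wrapper would call `allow_request(...)` before contacting
slurmctld and refuse (or sleep) when it returns False. As with the
approach described above, a user who deletes their state file defeats
the check, which is acceptable when the goal is to catch accidents.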
Getting users to use `sacct` rather than `squeue` to check what state a
job is in can also help a lot. It reduces the load on slurmctld, because
`sacct` queries the accounting database instead.
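
For scripts that poll job state, a minimal Python sketch of the `sacct`
route might look like the following. The function names are
hypothetical; the `sacct` options used (`-j`, `-X`, `-n`, `-P`,
`-o State`) are standard ones, and the sketch assumes `sacct` is on the
PATH and accounting is enabled.

```python
import subprocess


def sacct_state_command(job_id):
    """Build the sacct argv that prints only the allocation's state."""
    # -X: allocation only (no steps), -n: no header, -P: parsable output
    return ["sacct", "-j", str(job_id), "-X", "-n", "-P", "-o", "State"]


def parse_state(output):
    """Return the first state word from sacct output, or None if empty."""
    for line in output.splitlines():
        if line.strip():
            # States such as "CANCELLED by 1234" carry extra text;
            # keep only the first word.
            return line.split()[0]
    return None


def job_state(job_id):
    """Run sacct and return the job's state string (or None)."""
    result = subprocess.run(sacct_state_command(job_id),
                            capture_output=True, text=True, check=True)
    return parse_state(result.stdout)
```

Calling `job_state(12345)` in a polling loop (with a sensible sleep
between calls) keeps those queries away from slurmctld.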
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA