[slurm-users] Rate Limiting of RPC calls

Christopher Samuel chris at csamuel.org
Wed Feb 10 01:33:42 UTC 2021


On 2/9/21 5:08 pm, Paul Edmon wrote:

> 1. Being on the latest release: A lot of work has gone into improving 
> RPC throughput, if you aren't running the latest 20.11 release I highly 
> recommend upgrading.  20.02 also was pretty good at this.

We've not gone to 20.11 on production systems yet, but I can vouch for 
20.02 being far better than previous versions for scheduling performance.

We also use the cli_filter lua plugin to implement our own RPC limiting 
mechanism, using a local directory of per-user files. The big advantage 
of this is that the rate limiting happens client side, so the excess 
requests never get sent to the slurmctld in the first place.  Yes, it is 
theoretically possible for users to discover and work around this, but 
the intent here is to catch accidental/naive use rather than anything 
malicious.
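
To illustrate the general idea of client-side limiting with per-user 
files, here is a minimal sketch in Python. It is not the Lua plugin 
described above: the function name `allow_call`, the `.stamps` file 
layout, and the `max_calls`/`window` parameters are all assumptions made 
up for this example, and it does no locking between concurrent clients.

```python
import os
import tempfile
import time


def allow_call(state_dir, user, max_calls=10, window=60.0, now=None):
    """Return True if `user` may make another call, recording it if so.

    Hypothetical helper: keeps one file per user in `state_dir`, holding
    one timestamp per line for calls made within the last `window` seconds.
    """
    now = time.time() if now is None else now
    path = os.path.join(state_dir, f"{user}.stamps")

    # Load the user's previous call times; a missing file means no calls yet.
    try:
        with open(path) as fh:
            stamps = [float(line) for line in fh if line.strip()]
    except FileNotFoundError:
        stamps = []

    # Sliding window: forget calls older than `window` seconds.
    recent = [t for t in stamps if now - t < window]

    allowed = len(recent) < max_calls
    if allowed:
        recent.append(now)

    # Write to a temporary file and rename it into place, so a reader
    # never sees a half-written file (no locking is attempted here).
    fd, tmp = tempfile.mkstemp(dir=state_dir)
    with os.fdopen(fd, "w") as fh:
        fh.writelines(f"{t!r}\n" for t in recent)
    os.replace(tmp, path)

    return allowed
```

A wrapper would call something like `allow_call("/some/local/dir", 
username)` before running the real client command and refuse (or sleep) 
when it returns False. Because the check runs entirely on the client, a 
rejected call generates no RPC at all.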

Getting users to use `sacct` rather than `squeue` to check what state a 
job is in can also help a lot, since `sacct` reads from the accounting 
records instead, which reduces the load on slurmctld.
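
For scripts that poll job state, a rough sketch of doing it via `sacct` 
might look like the following. The helper names are invented for this 
example; the `sacct` options used are `-j` (job ID), `-X` (allocation 
only, no steps), `-n` (no header), `-P` (parsable output) and `-o State`.

```python
import subprocess


def sacct_state_command(job_id):
    """Build a sacct command line that prints only the job's state."""
    return ["sacct", "-j", str(job_id), "-X", "-n", "-P", "-o", "State"]


def parse_state(output):
    """Return the first state word from sacct output, or None if empty.

    A state such as "CANCELLED by 1000" is reduced to "CANCELLED".
    """
    for line in output.splitlines():
        line = line.strip()
        if line:
            return line.split()[0]
    return None


def job_state(job_id):
    """Ask sacct (not squeue) for the state of `job_id`."""
    result = subprocess.run(
        sacct_state_command(job_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_state(result.stdout)
```

Calling `job_state(12345)` needs a working `sacct` on the machine; the 
parsing is kept separate so it can be checked without a cluster.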

All the best,
Chris
-- 
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA


