[slurm-users] Rate-limiting sbatch and srun
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Jul 19 08:49:07 UTC 2022
On 7/19/22 08:15, Ole Holm Nielsen wrote:
> On 7/19/22 00:45, gphipps wrote:
>> Every so often one of our users accidentally writes a “fork-bomb”
>> that submits thousands of sbatch and srun requests per second. It is a
>> giant DDOS attack on our scheduler. Is there a way of rate limiting
>> these requests before they reach the daemon? I could imagine writing a
>> shim in front of sbatch/srun, but I was hoping there was an official way
>> to do this.
>
> Perhaps setting MaxSubmitJobs and MaxJobs on associations and QOSes would
> do the trick?
>
> You may also want to increase the default MaxJobCount in slurm.conf.
>
> See my Wiki page for the details:
> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#maxjobcount-limit
Another possibility would be to write a job_submit Lua plugin that rejects
jobs at submission time, before they are accepted into the queue. Of course,
you would have to define some logic that detects the "fork-bomb" situation,
which may not be so easy to do. See
https://slurm.schedmd.com/job_submit_plugins.html
I have some additional pointers to job submit plugins at
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
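
One way the detection logic could look is a per-user sliding-window counter:
reject a submission when the same user has already had too many accepted
within the last few seconds. The sketch below is in Python purely for
illustration; a real job submit plugin would be written in Lua (or C), and
the class name SubmitRateLimiter, its parameters, and the thresholds are all
hypothetical, not anything Slurm provides.

```python
import time
from collections import defaultdict, deque


class SubmitRateLimiter:
    """Hypothetical helper: allow at most `max_submits` accepted
    submissions per user within any `window_seconds` interval."""

    def __init__(self, max_submits, window_seconds, clock=time.monotonic):
        self.max_submits = max_submits
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested
        # uid -> timestamps of recently accepted submissions, oldest first
        self._recent = defaultdict(deque)

    def allow(self, uid):
        """Return True if this submission should be accepted."""
        now = self.clock()
        stamps = self._recent[uid]
        # Forget submissions that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_submits:
            return False  # too many recent submissions: reject
        stamps.append(now)
        return True
```

In a plugin, the equivalent of allow() would be called with the submitting
user's UID for every incoming job, and a False result would translate into
rejecting the job. Suitable values for the limit and the window depend on
the site, since legitimate workflows can also submit jobs in bursts.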
/Ole