Hello

We have another batch of new users, and some more batches of large array jobs with very short runtimes (due to errors in the jobs, or just by design). While trying to deal with these issues by setting ArrayTaskThrottle and through user education, I had a thought: it would be very nice to have a per-user limit on how many jobs can start in a given minute. If someone submitted a 200000-task array job with 15-second tasks, the scheduler wouldn't launch more than 100 or 200 per minute and would be less likely to bog down; if the tasks had longer runtimes (1 hour +), it would take a few extra minutes to ramp up to all the resources they are allowed, but wouldn't add much overall delay to the whole set of jobs.
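
For reference, here's roughly what we're doing today with the per-array throttle (the job script name and job ID below are just placeholders):

    # Cap the array at 200 simultaneously running tasks at submit time:
    sbatch --array=0-199999%200 my_job.sh

    # Or adjust the throttle on an already-submitted array:
    scontrol update JobId=12345 ArrayTaskThrottle=200

The catch is that ArrayTaskThrottle caps concurrency, not start rate: with 15-second tasks, a throttle of 200 still allows roughly 200 * (60/15) = 800 task starts per minute, which is exactly the churn I'd like to cap directly.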

I thought about adding something to our CLI filter, but these jobs usually request a runtime of 3-4 hours even though they run for <30 seconds, so the submit options don't flag the problem jobs ahead of time.
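
The mismatch does show up clearly after the fact; something like the following (user name and time window are placeholders) lists tasks that requested hours but finished in seconds:

    # Compare requested vs. actual runtime for a user's recent jobs:
    sacct -u someuser -S now-1days -X --format=JobID,Timelimit,Elapsed,State

But by then the scheduler has already taken the hit.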

We currently limit our users to 80% of the available resources, which is still far more than enough for Slurm to bog down under fast-turnover jobs. But we have users who complain that they can't use the other 20% when the cluster is not busy, so putting lower default limits in place is not currently an option.
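
(In case it matters, the kind of limit I mean is a per-user TRES cap on the QOS, roughly of this shape; the QOS name and cpu count here are made up, not our real config:)

    # Hypothetical per-user CPU cap at ~80% of the cluster:
    sacctmgr modify qos normal set MaxTRESPerUser=cpu=800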

Has this already been discussed and found to be infeasible for technical reasons? (I haven't found anything like it yet searching the archives.)

I think Slurm used to have a feature-request severity on their bug submission site. Is there a severity level they prefer for suggestions like this?

Thanks