[slurm-users] GrpTRESRunMins and submitting to multiple partitions
Nathan R.M. Crawford
nrcrawfo at uci.edu
Fri May 11 16:19:09 MDT 2018
Hi All,
I'm trying out GrpTRESRunMins to prevent users from
opportunistically flooding an empty partition with long jobs. We have a
partition set up for each CPU type, and give each association
(account/user/partition) a separate limit based on that account's share of
the partition.
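For anyone unfamiliar with the limit, here is a rough Python sketch of the accounting I understand GrpTRESRunMins to perform for the CPU TRES: the sum of CPUs times remaining time limit over an association's running jobs, compared against the cap. This is only an illustration, not Slurm code; the names RunningJob, running_cpu_minutes, and fits_under_limit are made up, and the numbers are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningJob:
    """Hypothetical record of one running job in an association."""
    cpus: int
    minutes_left: int  # remaining time limit, in minutes


def running_cpu_minutes(jobs: List[RunningJob]) -> int:
    """CPU-minutes still committed by the association's running jobs."""
    return sum(job.cpus * job.minutes_left for job in jobs)


def fits_under_limit(jobs: List[RunningJob], new_cpus: int,
                     new_minutes: int, limit: int) -> bool:
    """True if starting the new job keeps the total at or below the limit."""
    return running_cpu_minutes(jobs) + new_cpus * new_minutes <= limit


if __name__ == "__main__":
    # Placeholder jobs and limit, chosen only for illustration.
    jobs = [RunningJob(cpus=16, minutes_left=600),
            RunningJob(cpus=8, minutes_left=120)]
    print(running_cpu_minutes(jobs))
    print(fits_under_limit(jobs, new_cpus=4, new_minutes=60, limit=10800))
```

A long job requests many CPU-minutes up front, so it hits the cap sooner than a series of short jobs using the same cores.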
It seems to work as expected, except when a job is submitted to multiple
partitions. A few jobs were blocked even though only one of their partitions
was over the limit. The blocking partition was alphabetically first,
so I'm guessing that the GrpTRESRunMins check doesn't attempt to look at
the others after one fails.
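To make the guess concrete, here is a toy Python model of the two behaviors. It is purely a sketch of my hypothesis, not derived from the Slurm source; the function names and the partition names "broadwell" and "skylake" are invented for the example.

```python
from typing import Iterable, Optional, Set


def stop_at_first_failure(partitions: Iterable[str],
                          over_limit: Set[str]) -> Optional[str]:
    """Guessed behavior: give up as soon as one partition fails the check."""
    for name in sorted(partitions):
        if name in over_limit:
            return None  # job stays blocked; later partitions are never tried
        return name
    return None


def try_every_partition(partitions: Iterable[str],
                        over_limit: Set[str]) -> Optional[str]:
    """Behavior I expected: skip a failing partition and try the next one."""
    for name in sorted(partitions):
        if name not in over_limit:
            return name
    return None


if __name__ == "__main__":
    requested = ["skylake", "broadwell"]  # hypothetical partition names
    over = {"broadwell"}                  # only the alphabetically first is over
    print(stop_at_first_failure(requested, over))
    print(try_every_partition(requested, over))
```

With only the alphabetically first partition over its limit, the first function leaves the job blocked while the second picks the other partition, which matches what we observed versus what we wanted.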
This is with Slurm 17.11.4. I haven't dug around in the code, but I didn't
see any relevant changes in the changelog for 17.11.6.
It's being used as a secondary backstop against abuse, so the limits aren't
hit often, but suggestions for a fix or workaround would be welcome!
Thanks,
Nate
--
Dr. Nathan Crawford nathan.crawford at uci.edu
Modeling Facility Director
Department of Chemistry
1102 Natural Sciences II Office: 2101 Natural Sciences II
University of California, Irvine Phone: 949-824-4508
Irvine, CA 92697-2025, USA