[slurm-users] MaxTRESRunMinsPU not yet enabled - similar options?
jstroik at ssec.wisc.edu
Mon May 20 18:15:51 UTC 2019
Esteemed Slurm users,
I am trying to mitigate a use case where jobs that request both the
maximum number of nodes allowed and the maximum walltime slip in while
the queue itself is briefly empty. The general idea is that users are
allowed to use up to half of the nodes in the QoS, and jobs are allowed
to run for up to 6 hours in that QoS, but we've seen cases where users
repeatedly request both at once.
What I currently do is this:
- allow users to use up to X nodes simultaneously
- set the maxwall to Y time
And my dream is to add:
- limit users who request the full X nodes to Y/n time
This would allow users to run for the maxwall time where needed, or to
run on the maximum number of nodes, but it would dissuade them from
doing both at once, because a job requesting maxwall would be limited to
a smaller subset of nodes.
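To make the trade-off concrete, here is a minimal sketch of the policy
described above, written as a per-user node-minutes budget of X * Y / n.
The function name and the example values (32 nodes, 360 minutes, n = 4)
are assumptions for illustration only, and the sketch considers a single
job rather than the sum over all of a user's running jobs.

```python
def max_walltime_minutes(nodes, max_nodes, max_wall_minutes, n):
    """Longest walltime a job on `nodes` nodes could request.

    Hypothetical helper: models the desired policy as a budget of
    max_nodes * max_wall_minutes / n node-minutes, capped at maxwall.
    """
    if not 1 <= nodes <= max_nodes:
        raise ValueError("nodes must be between 1 and max_nodes")
    budget = max_nodes * max_wall_minutes / n  # node-minutes available
    return min(max_wall_minutes, budget / nodes)


# Assumed example: X = 32 nodes, Y = 360 minutes (6 hours), n = 4.
full_nodes = max_walltime_minutes(32, 32, 360, 4)   # all X nodes -> Y/n
few_nodes = max_walltime_minutes(8, 32, 360, 4)     # X/n nodes -> full Y
```

With these assumed numbers, a job on all 32 nodes is held to Y/n, while
a job on 8 nodes or fewer can still run for the full maxwall.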
A solution that I feel might work well would be to limit the maximum run
minutes each user can submit to the QoS at a time. I attempted to
implement and test this without success this morning. I looked at the
code to try to determine what I was doing wrong and came across this:
/* MaxTRESRunMinsPU doesn't do anything yet, if/when it does
* change the last param in the print_tres_line to 0. */
I did test setting GrpTRESRunMins=cpu=N for each user + account
association, and that does appear to work. Does anyone know of any other
solutions to this issue?
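For reference, a rough sketch of how the per-association workaround
could be scripted. The helper name, the user and account names, and the
node and CPU counts are all hypothetical; N is assumed to be the
CPU-minute equivalent of X nodes for Y/n minutes, and the sacctmgr
command line is only printed, not executed.

```python
def grp_tres_run_mins_command(user, account, max_nodes, cpus_per_node,
                              max_wall_minutes, n):
    """Build a sacctmgr command setting GrpTRESRunMins on one association.

    Hypothetical helper: N = max_nodes * cpus_per_node * max_wall_minutes / n,
    i.e. the CPU-minutes of a full-node job running for Y/n.
    """
    cpu_minutes = max_nodes * cpus_per_node * max_wall_minutes // n
    return (
        f"sacctmgr modify user where name={user} account={account} "
        f"set GrpTRESRunMins=cpu={cpu_minutes}"
    )


# Assumed example values: 32 nodes, 16 CPUs per node, 360 minutes, n = 4.
print(grp_tres_run_mins_command("alice", "proj", 32, 16, 360, 4))
```

Looping this over every user + account pair would reproduce the setup
that was tested, assuming the same N is appropriate for each
association.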
(Of course, we will talk to the users as well - but a reasonable
technical solution is a nice backstop).
Thanks,
Jesse Stroik