[slurm-users] MaxTRESRunMinsPU not yet enabled - similar options?

jstroik at ssec.wisc.edu jstroik at ssec.wisc.edu
Mon May 20 18:15:51 UTC 2019


Esteemed Slurm users,

I am trying to mitigate a pattern in which jobs requesting both the 
maximum number of nodes and the maximum wall time slip in when the 
queue is briefly empty. The general idea is that users are allowed to 
use up to half of the nodes in the QoS, and jobs are allowed to run for 
up to 6 hours in that QoS, but we've seen cases where users repeatedly 
request both at once.

What I currently do is this:

  - allow users to use up to X nodes simultaneously
  - set the maxwall to Y time

And my dream is to add:

  - limit users who request the full X nodes to Y/n time

This would let users run for the full maxwall where needed, or run on 
the maximum number of nodes, but not both at once: a job requesting 
maxwall would be limited to a smaller subset of nodes.
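
As a rough sketch of that trade-off, one can treat X nodes for Y/n time 
as a fixed budget of node-minutes and derive the longest wall time a 
request for a given node count would get. The function name and the 
numbers (8 nodes, 360 minutes, n=2) are hypothetical and only 
illustrate the arithmetic; this is not something Slurm provides.

```python
def max_wall_for(nodes, max_nodes, max_wall, n):
    """Longest wall time (minutes) for a request of `nodes` nodes.

    Assumes a per-user budget of max_nodes * max_wall / n node-minutes,
    so a full-width job gets max_wall / n and narrower jobs get more,
    capped at max_wall.
    """
    if not 1 <= nodes <= max_nodes:
        raise ValueError("nodes must be between 1 and max_nodes")
    budget = max_nodes * max_wall / n  # node-minutes available
    return min(max_wall, budget / nodes)


# Hypothetical values: X = 8 nodes, Y = 360 minutes (6 hours), n = 2.
for nodes in (8, 4, 1):
    print(nodes, max_wall_for(nodes, 8, 360, 2))
```

With these values a full 8-node request is held to 180 minutes, while 
a request for 4 nodes or fewer can still use the full 360.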

A solution that I feel might work well would be to limit the maximum run 
minutes each user can submit to the QoS at a time. I attempted to 
implement and test this without success this morning. I looked to the 
code to try to determine what I was doing wrong and came across this:

     /* MaxTRESRunMinsPU doesn't do anything yet, if/when it does
      * change the last param in the print_tres_line to 0. */

I did test setting GrpTRESRunMins=cpu=N for each user + account 
association, and that does appear to work. Does anyone know of any other 
solutions to this issue?
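
For illustration, a value for N could be derived as the CPU-minutes of 
X nodes running for Y/n time, and one sacctmgr command built per 
association. This is a sketch only: the user name, account name, and 
cores-per-node figure are assumptions, and the command string should 
be checked against the sacctmgr man page for the installed version 
before use.

```python
def cpu_run_minutes(max_nodes, cpus_per_node, max_wall_minutes, n):
    """CPU-minutes consumed by max_nodes nodes running for max_wall / n."""
    return max_nodes * cpus_per_node * max_wall_minutes // n


def sacctmgr_command(user, account, cpu_minutes):
    """Build (but do not run) a command that sets the association limit."""
    return (
        f"sacctmgr modify user where name={user} account={account} "
        f"set GrpTRESRunMins=cpu={cpu_minutes}"
    )


# Hypothetical values: 8 nodes of 24 cores each, 360-minute maxwall, n = 2.
limit = cpu_run_minutes(8, 24, 360, 2)
print(sacctmgr_command("alice", "proj", limit))
```

Because the limit is counted in CPU-minutes, a site with mixed node 
sizes would need to choose which cores-per-node figure to base N on.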

(Of course, we will talk to the users as well - but a reasonable 
technical solution is a nice backstop).

Thanks,
Jesse Stroik
