[slurm-users] Advice on using GrpTRESRunMin=cpu=<limit>
David Baker
D.J.Baker at soton.ac.uk
Wed Feb 12 16:45:32 UTC 2020
Hello,
Before implementing "GrpTRESRunMin=cpu=<limit>" on our production cluster I'm doing some tests on the development cluster. I've only got a handful of compute nodes to play with, and so I have set the limit sensibly low. That is, I've set the limit to 576,000 CPU-minutes, which is equivalent to 400 CPU-days. In other words, I can potentially submit the following job...
1 job x 2 nodes x 80 CPUs/node x 2.5 days = 400 CPU-days (576,000 CPU-minutes)
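For reference, the limit was applied along these lines with sacctmgr (the account name "test_acct" below is just a placeholder, not our real hierarchy):

# Cap the running CPU-minutes for the association (576,000 min = 400 CPU-days)
sacctmgr modify account test_acct set GrpTRESRunMins=cpu=576000
# Confirm the limit is in place
sacctmgr show assoc where account=test_acct format=account,user,grptresrunmins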
I submitted a set of jobs, each requesting 2 nodes with 80 CPUs/node for 2.5 days. The first job is running and the rest are in the queue -- what I see makes sense...
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
677 debug myjob djb1 PD 0:00 2 (AssocGrpCPURunMinutesLimit)
678 debug myjob djb1 PD 0:00 2 (AssocGrpCPURunMinutesLimit)
679 debug myjob djb1 PD 0:00 2 (AssocGrpCPURunMinutesLimit)
676 debug myjob djb1 R 12:52 2 navy[54-55]
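For completeness, each of those jobs was submitted with something along these lines (the script name is just a placeholder):

# 2 nodes x 80 CPUs/node for 2.5 days = 400 CPU-days per job
sbatch -p debug -J myjob -N 2 --ntasks-per-node=80 -t 2-12:00:00 myjob.sh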
On the other hand, I expected the queued jobs not to accrue priority; however, they do appear to be doing so (see the sprio output below). I'm working with Slurm v19.05.2. Have I missed something vital in the config? We hoped that the queued jobs would not accrue priority. We haven't, for example, set PriorityFlags=ACCRUE_ALWAYS. Have I got that wrong? Could someone please advise us?
Best regards,
David
[root@navy51 slurm]# sprio
JOBID PARTITION PRIORITY SITE AGE FAIRSHARE JOBSIZE QOS
677 debug 5551643 100000 1644 450000 5000000 0
678 debug 5551643 100000 1644 450000 5000000 0
679 debug 5551642 100000 1643 450000 5000000 0
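For what it's worth, these are the checks I've run so far -- whether AccrueTime is the right field to be looking at here is just my guess:

# Check whether ACCRUE_ALWAYS appears in PriorityFlags
scontrol show config | grep -i PriorityFlags
# Inspect the accrue-related fields on one of the pending jobs
scontrol show job 677 | grep -i Accrue
# Long listing of the priority factors for the pending jobs
sprio -l -j 677,678,679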