[slurm-users] Account Usage Discrepancies
John Roberts
roberts.johneric at gmail.com
Mon Nov 27 15:06:28 MST 2017
Hoping someone will get eyes on this one. I ended up changing the partition
in question to use only 1 thread per core to keep things simple, but it
would still be nice to know why Slurm is looking at TRES minutes instead of
RawUsage.
Thanks.
-John
On Wed, Nov 15, 2017 at 10:55 AM, John Roberts <roberts.johneric at gmail.com>
wrote:
> Hi,
>
> I'm having an issue with accounts in slurm and not sure if I'm missing
> something. Here's a quick breakdown of the issue:
>
> We have many accounts in Slurm (v16.05.10) / SlurmDBD. We recently set one
> partition's billing weight to 0.25. Each node in this partition has 64 cores
> with 4 threads per core. We set the weight to 0.25 so that we bill only for
> core hours, not for threads. This part seems to be working ok.
>
> When querying the account balance via RawUsage (we use sbank to present
> this to the user in readable hours), the numbers look right. They come out
> to a quarter of a full node.
>
> However, when querying, say, "UserUtilizationByAccount", the number is
> about 4 times as much. This also makes sense, because jobs are technically
> allocated all cores and threads, but we only expect to bill for a quarter
> of that time.
>
> The problem arose when a user of this account tried to submit a job and it
> sat in the queue with the error "AssocGrpCPUMinutesLimit".
>
> Turning up the debug logs showed this:
>
> "debug2: Job 161868 being held, the job is at or exceeds assoc
> 2159(<foo>/(null)/(null)) group max tres(cpu) minutes of 150000000 of which
> 27718972 are still available but request is for 94371840 (plus 0 already in
> use) tres minutes (request tres count 65536)"
>
> The available number above "27718972" matches what the balance would have
> been from the max CPU minutes minus the usage from
> "UserUtilizationByAccount" instead of reporting the real balance of 4x that
> number.
>
> Why would Slurm be trying to schedule jobs based on this number instead of
> RawUsage? If we're billing it lower, RawUsage should be the true balance,
> but that doesn't seem to be the case.
>
> thanks!
> -John
>
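For anyone following along, the figures in the debug2 line quoted above can be checked with a bit of arithmetic. The Python sketch below only reproduces the numbers in that log line and compares them with what the balance would look like if the 0.25 billing weight were applied. It says nothing authoritative about how Slurm enforces the limit internally; the "weighted" view is a hypothetical for comparison, and the helper name check_request is made up for this illustration.

```python
# Figures copied verbatim from the debug2 line quoted above.
LIMIT_CPU_MINUTES = 150_000_000      # "group max tres(cpu) minutes"
AVAILABLE_CPU_MINUTES = 27_718_972   # "are still available"
REQUEST_CPU_MINUTES = 94_371_840     # "request is for"
REQUEST_CPU_COUNT = 65_536           # "request tres count"

# From the original message: the partition's billing weight.
BILLING_WEIGHT = 0.25


def check_request(limit, used, request):
    """Hypothetical helper: return (remaining, fits) for a minutes limit."""
    remaining = limit - used
    return remaining, request <= remaining


# Usage implied by the log line itself (limit minus what is still available).
used_unweighted = LIMIT_CPU_MINUTES - AVAILABLE_CPU_MINUTES

# Wall time implied by the request: CPU minutes divided by CPU count.
walltime_minutes = REQUEST_CPU_MINUTES // REQUEST_CPU_COUNT

# View 1: unweighted CPU minutes, which is what the log line reports.
unweighted = check_request(LIMIT_CPU_MINUTES, used_unweighted,
                           REQUEST_CPU_MINUTES)

# View 2 (assumption, for comparison only): usage and request both scaled
# by the billing weight, as one might expect if the limit tracked RawUsage.
weighted = check_request(LIMIT_CPU_MINUTES,
                         used_unweighted * BILLING_WEIGHT,
                         REQUEST_CPU_MINUTES * BILLING_WEIGHT)

print("used (unweighted CPU minutes):", used_unweighted)
print("requested wall time (minutes):", walltime_minutes)
print("unweighted remaining, fits:", unweighted)
print("weighted remaining, fits:", weighted)
```

Under these assumptions the request works out to 65536 CPUs for 1440 minutes (24 hours). It exceeds the unweighted remainder, which matches the job being held, but it would fit comfortably if the weighted figures were the ones being compared.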