[slurm-users] Sreport: given wrong/weird results?
Thiago Abdo
tjabdo at inf.ufpr.br
Mon Dec 16 12:46:23 UTC 2019
Hi,
I built a small test cluster before putting it into production. While
testing sreport's capabilities I found some inconsistencies in its
output (or, more probably, I misunderstood something).
Last Friday our virtual machines were offline, so I would expect
sreport to report 0 for all users, but somehow some of them have
usage reported.
I used this command:
sreport cluster -T all AccountUtilizationByUser users=userb1,usera1
start=2019-12-13T00:00 end=2019-12-13T23:59:59
I also checked with sacct: some jobs were submitted to the queue while
all the machines were turned off, but no job ran (as expected).
So how can sreport show any usage?
I was also researching a way to retrieve the usage of the cluster by
its users. I expected to do it using the billing TRES, but it has two
problems. One is that it is stored as an integer, so I am losing some
information. Is there a way to store it as a floating-point value?
The billing TRES also shows an inconsistency: for some users it is the
sum of CPU and node usage, and for others it is just the CPU usage. I
would expect it to be max(cpu, mem), since I set
TRESBillingWeights="CPU=1.0,Mem=0.56G" on both of my test partitions (I
have only these two partitions) and PriorityFlags=MAX_TRES in the
config. What am I missing?
(
Some background on my tests, in case it is helpful:
I have two users. One runs a CPU-intensive task on a full node (4
procs), and his billing TRES is equal to the cpu TRES.
The other runs a memory-intensive task that uses half of the node's
CPUs and (almost) all of its memory (7300M), and his billing TRES is
the sum of the cpu and node TRES.
I would expect both of these users to have nearly, if not exactly, the
same billing TRES, as each is using a full node.
)
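To make that expectation concrete, here is a minimal sketch of the
arithmetic, not Slurm's actual code. It assumes that the Mem weight
"0.56G" means 0.56 per 1024 MB of allocated memory, that
PriorityFlags=MAX_TRES takes the largest weighted TRES while the default
sums them, and that the integer storage simply drops the fractional
part. The function names and the first user's 1024M allocation are made
up for illustration.

```python
# Sketch of the expected billing arithmetic; not Slurm's implementation.
CPU_WEIGHT = 1.0          # from TRESBillingWeights="CPU=1.0,Mem=0.56G"
MEM_WEIGHT_PER_GB = 0.56  # assumption: "0.56G" means 0.56 per 1024 MB


def billable(cpus, mem_mb, max_tres=True):
    """Weighted billing value as a float (hypothetical helper)."""
    weighted = [cpus * CPU_WEIGHT, mem_mb / 1024 * MEM_WEIGHT_PER_GB]
    # Assumption: MAX_TRES takes the largest weighted TRES; default sums.
    return max(weighted) if max_tres else sum(weighted)


def stored(cpus, mem_mb, max_tres=True):
    """Assumption: the integer billing TRES drops the fractional part."""
    return int(billable(cpus, mem_mb, max_tres))


if __name__ == "__main__":
    # CPU-bound user: 4 CPUs; the 1024M figure is invented for the example.
    print("cpu user:", billable(4, 1024), stored(4, 1024))
    # Memory-bound user: 2 CPUs and 7300M, as in the background above.
    print("mem user:", billable(2, 7300), stored(2, 7300))
```

Under these assumptions the memory-bound user comes out at about 3.99,
which would be stored as 3. That happens to equal 2 CPUs + 1 node, so
the "sum" I observed may be a coincidence of truncation rather than a
different formula, but I have not confirmed this.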
Thank you,
Thiago