[slurm-users] sreport User Utilisation Over 300%

mshubham mshubham at cdac.in
Fri Jul 29 08:59:37 UTC 2022


Dear All,
I am facing an issue in Slurm (20.11.8): sreport cluster utilization reports
100%, but when I run sreport cluster userutilizationbyaccount, some users show
utilisation greater than 100%. Three users, including root, show utilisation
over 250%, making the overall utilisation 500% (although the users have not
submitted any jobs in the past week).
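
As a rough sanity check on figures like these, the percentages can be recomputed
by hand. The sketch below is only an illustration and assumes that a user's
utilisation is simply that user's CPU-minutes divided by the CPU-minutes
available on the cluster during the report window; sreport's real accounting may
differ. The function names and the numbers are hypothetical, not taken from our
cluster.

```python
def utilisation_percent(used_cpu_minutes, cluster_cpus, window_minutes):
    """Return used CPU time as a percentage of the cluster's capacity."""
    capacity = cluster_cpus * window_minutes
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used_cpu_minutes / capacity


def flag_impossible(usage_by_user, cluster_cpus, window_minutes):
    """Return users whose reported usage alone exceeds the cluster capacity."""
    capacity = cluster_cpus * window_minutes
    return {
        user: round(utilisation_percent(minutes, cluster_cpus, window_minutes), 1)
        for user, minutes in usage_by_user.items()
        if minutes > capacity
    }


if __name__ == "__main__":
    # Hypothetical example: 100 CPUs over a one-week window.
    week = 7 * 24 * 60
    usage = {"root": 2_520_000, "alice": 100_800}
    print(flag_impossible(usage, 100, week))
```

Any user flagged this way has more CPU time recorded than the cluster could
physically have delivered in the window, which points at the accounting records
rather than at real jobs.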

It showed some runaway jobs, which we cleared. The same runaway jobs then
appeared again, and we cleared them again (both manually and through the
command).

Before that, we had encountered another issue. In our cluster with primary and
backup Slurm controllers, we kept a common mount point for the
"StateSaveLocation", /var/share/slurm/ctld. We then observed a strange
behaviour: "If the mount point is present and the service is restarted on the
primary controller, then it replaces all the StateSaveLocation files."

This resulted in the cancellation of all jobs (running and pending) and
reservations, and JobIDs for newly submitted jobs were assigned starting from 1
again. If the StateSaveLocation is kept on a local file system instead of a
shared mount point, then everything works fine even after restarting the
slurmctld service.

Since that issue, the reported utilisation has been higher than expected, though
it has not affected any real job utilisation.

Also, we have removed those users' accounts in Slurm, yet their utilisation is
still shown.

Please help in resolving this issue.


Thanks and Regards,
Shubham Mehta
HPC Technology
CDAC Pune
