[slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU
Loris Bennett
loris.bennett at fu-berlin.de
Tue Dec 14 13:16:52 UTC 2021
Hi Ole,
Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
> The latest pestat version now adds a red color highlight if the GRES GPU is the
> (null) value.
>
> We use this to highlight jobs on GPU nodes which didn't request any GPU
> resources, thereby possibly wasting resources.
>
> Could you test if this is useful and give me feedback?
In job_submit.lua we check whether a job sent to the GPU partition has
actually requested a GPU as a TRES and, if not, reject it. So that kind
of wastage doesn't occur.
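
For illustration, a check along these lines could look roughly like the
sketch below. It is not our production script: the partition name "gpu"
and the helper names are assumptions, and the exact format of the TRES
strings differs between Slurm releases, so the sketch only looks for the
substring "gpu". Jobs that name no partition at all (job_desc.partition
is nil) are simply let through here.

```lua
-- Hypothetical site setting: names of the GPU partitions (an assumption).
local GPU_PARTITIONS = { gpu = true }

-- job_desc fields that may carry a GPU request as a TRES string.
local TRES_FIELDS = {
   "tres_per_node", "tres_per_job", "tres_per_socket", "tres_per_task",
}

-- Return true if any TRES request field mentions a GPU.
-- Only a plain substring match, because the string format
-- (e.g. "gpu:1" vs "gres:gpu:1") depends on the Slurm version.
local function requests_gpu(job_desc)
   for _, field in ipairs(TRES_FIELDS) do
      local value = job_desc[field]
      if type(value) == "string" and value:find("gpu", 1, true) then
         return true
      end
   end
   return false
end

-- Return true if a comma-separated partition list names a GPU partition.
local function targets_gpu_partition(partition)
   if type(partition) ~= "string" then
      return false
   end
   for name in partition:gmatch("[^,]+") do
      if GPU_PARTITIONS[name] then
         return true
      end
   end
   return false
end

-- Entry point called by slurmctld for every submitted job.
function slurm_job_submit(job_desc, part_list, submit_uid)
   if targets_gpu_partition(job_desc.partition)
      and not requests_gpu(job_desc) then
      slurm.log_user("Jobs in the GPU partition must request a GPU, " ..
                     "e.g. --gres=gpu:1")
      return slurm.ERROR
   end
   return slurm.SUCCESS
end

-- The plugin also expects a modify hook; this sketch accepts all changes.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

A scavenger partition that deliberately accepts non-GPU jobs would just
be left out of GPU_PARTITIONS.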
However, we do sometimes push non-GPU jobs onto GPU nodes within a
scavenger partition, so it would be handy if pestat highlighted these.
At the moment, though, there are no such jobs, so I can't test.
It would, however, be good to be able to display the utilisation of the
GPUs on the command line. Some people request GPUs, but their jobs
barely use them. At the opposite end of the usage spectrum, today, via
our Zabbix monitoring, I spotted some jobs with unusually high GPU
efficiency, which turned out to be doing cryptomining :-/
Cheers,
Loris
--
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin Email loris.bennett at fu-berlin.de