[slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

Loris Bennett loris.bennett at fu-berlin.de
Tue Dec 14 13:16:52 UTC 2021


Hi Ole,

Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:

> The latest pestat version now adds a red color highlight if the GRES GPU is the
> (null) value.
>
> We use this to highlight jobs on GPU nodes which didn't request any GPU
> resources, thereby possibly wasting resources.
>
> Could you test if this is useful and give me feedback?

In job_submit.lua we check whether a job sent to the GPU partition has
actually requested a GPU as a TRES and, if not, reject it.  So that kind
of wastage doesn't occur.
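
To illustrate the logic of that check, here is a minimal sketch in Python. It is not our actual plugin, which is written in Lua against Slurm's job_submit API. The partition name "gpu", the function names, and the assumption that a GPU request shows up in a TRES string such as "gres:gpu:2" or "gres/gpu:a100:1" are all hypothetical; the exact field names and string formats depend on the Slurm version.

```python
import re
from typing import Iterable, Optional, Tuple

# Hypothetical name of the partition that requires a GPU request.
GPU_PARTITIONS = {"gpu"}

# Matches "gpu" as a whole GRES name, e.g. "gres:gpu:2",
# "gres/gpu:a100:1" or "gpu:1", but not "cpu=4" or "gpumem".
_GPU_RE = re.compile(r"(?:^|[,:/])gpu(?:[:=,]|$)")


def requests_gpu(tres_specs: Iterable[Optional[str]]) -> bool:
    """Return True if any of the TRES strings asks for a GPU."""
    return any(spec and _GPU_RE.search(spec) for spec in tres_specs)


def check_job(partition: str,
              tres_specs: Iterable[Optional[str]]) -> Tuple[bool, str]:
    """Accept or reject a job; returns (accepted, message)."""
    if partition in GPU_PARTITIONS and not requests_gpu(tres_specs):
        return False, "Jobs in the GPU partition must request a GPU"
    return True, ""


if __name__ == "__main__":
    print(check_job("gpu", ["gres:gpu:2"]))
    print(check_job("gpu", [None, "cpu=4"]))
```

In the real plugin the TRES strings would come from the job description passed to the submit function, and a rejection would be signalled with an error return value rather than a tuple.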

However, we do sometimes push non-GPU jobs onto GPU nodes within a
scavenger partition, so it would be handy if pestat highlighted these.
At the moment, though, there are no such jobs, so I can't test.

It would however be good to be able to display the utilisation of the
GPUs via the command-line.  Some people request GPUs, but the jobs don't
manage to use them very much.  At the opposite end of the usage
spectrum, today, via our Zabbix monitoring, I spotted some jobs with
unusually high GPU efficiencies, which turned out to be doing
cryptomining :-/
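
As a rough idea of what a command-line view of GPU utilisation could build on, here is a small Python sketch. It assumes NVIDIA GPUs and that nvidia-smi is available on the node; the function names and the 10% threshold are made up for illustration, and this is not something pestat does. It only gives an instantaneous, per-node reading and does not attribute usage to individual jobs.

```python
import subprocess
from typing import List, Optional, Tuple

Reading = Tuple[int, Optional[int]]  # (GPU index, utilisation in percent)


def parse_gpu_utilisation(text: str) -> List[Reading]:
    """Parse 'index, utilisation' CSV lines as printed by nvidia-smi."""
    readings: List[Reading] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        index_str, util_str = (field.strip() for field in line.split(",", 1))
        try:
            util: Optional[int] = int(util_str)
        except ValueError:
            util = None  # e.g. "[N/A]" when the value is unavailable
        readings.append((int(index_str), util))
    return readings


def low_utilisation(readings: List[Reading], threshold: int) -> List[int]:
    """Return the indices of GPUs whose utilisation is below threshold."""
    return [idx for idx, util in readings
            if util is not None and util < threshold]


def query_gpu_utilisation() -> List[Reading]:
    """Run nvidia-smi on the local node and parse its output."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_gpu_utilisation(out)


if __name__ == "__main__":
    current = query_gpu_utilisation()
    print("Nearly idle GPUs:", low_utilisation(current, threshold=10))
```

A single sample like this says little about a whole job, so in practice one would want to average over time, which is what the monitoring system already does for us.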

Cheers,

Loris

-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de


