[slurm-users] How to map slurm node state to a "meaningful state"

Marcin Stolarek stolarek.marcin at gmail.com
Tue Mar 20 08:38:50 MDT 2018


In our environment we send various statistics to Grafana, where we
have dashboards designed for the IT team (either shown on a TV or
consulted from time to time to foresee future limitations or
unused resources), but we also have dashboards for our management to help
them evaluate our work.

One of the plots presents node states, where we have to translate the Slurm
node state into something meaningful for someone who is not a Slurm expert.
For example, a draining node should be presented in a different "meaningful
state" depending on whether "a job is still running there" or "the reason
suggests that it was drained by a healthcheck and will be resumed
automatically", etc. Do you have any experience with a similar situation?
How do you translate DRAIN+MIX+NoResp for a human being?

I'm interested in every aspect, such as whether I should start from
scontrol show node or sinfo, and what names you ended up with for the states.
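
To make the question concrete, here is a minimal sketch of the kind of
translation I mean. It assumes the state arrives as a "+"-joined string such
as MIXED+DRAIN (the form scontrol show node prints in its State= field), that
a trailing "*" or a NOT_RESPONDING flag marks a non-responding node, and that
healthcheck drains can be recognised by a marker in the Reason text. The label
names, the function name and the "healthcheck" marker are all hypothetical
and only there to illustrate the idea.

```python
# Hypothetical marker: assumes our healthcheck script puts this word in Reason.
HEALTHCHECK_MARKER = "healthcheck"


def classify_node(state: str, reason: str = "") -> str:
    """Map a Slurm node state string to a dashboard-friendly label.

    The returned labels are illustrative, not an established convention.
    """
    # A trailing "*" is how Slurm tools mark a node that is not responding.
    not_responding = "*" in state
    tokens = [t.strip("* ").upper() for t in state.split("+") if t.strip("* ")]
    if not tokens:
        return "other"
    base, flags = tokens[0], set(tokens[1:])

    if not_responding or "NOT_RESPONDING" in flags:
        return "unreachable"
    if base == "DOWN":
        return "down"
    if "DRAIN" in flags:
        if base in ("ALLOCATED", "MIXED"):
            # Jobs are still running; the node empties out as they finish.
            return "draining (jobs still running)"
        if HEALTHCHECK_MARKER in reason.lower():
            return "drained (healthcheck, auto-resume expected)"
        return "drained (manual)"
    if base in ("ALLOCATED", "MIXED"):
        return "in use"
    if base == "IDLE":
        return "free"
    return "other"


if __name__ == "__main__":
    print(classify_node("MIXED+DRAIN"))
    print(classify_node("IDLE+DRAIN", "healthcheck: disk full"))
```

The order of the checks is a design choice: "unreachable" wins over
everything else, on the assumption that a non-responding node is the most
urgent thing to show, regardless of its other flags.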

cheers,
Marcin