sinfo history


Steve Kirk

28 Jul 2025

9:48 a.m.

Hi,

Am I correct in thinking that the history of a *node* as shown by sinfo isn't stored anywhere by Slurm?

I'm interested to know whether Slurm can tell me historically when a node was draining, drained, etc.

Regards, Steve


Paul Edmon

28 Jul

9:58 a.m.

Correct. We run Prometheus collectors that pull node state so we can graph it over time.

https://github.com/fasrc/prometheus-slurm-exporter

-Paul Edmon-
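
For sites without a Prometheus stack, the same idea (poll node state and keep the changes) can be sketched in a few lines. This is only an illustration and not how the linked exporter works: the function names, the 60-second interval, and the print-based logging are all assumptions. It relies on `sinfo -N -h -o "%N %T"` printing one "node state" pair per line.

```python
import subprocess
import time


def parse_sinfo(text):
    """Turn 'node state' lines into a {node: state} dict.

    With -N a node in several partitions is listed more than once;
    later lines simply overwrite earlier ones.
    """
    states = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            states[fields[0]] = fields[1]
    return states


def diff_states(previous, current, timestamp):
    """Return (timestamp, node, old_state, new_state) for every change."""
    changes = []
    for node, state in sorted(current.items()):
        old = previous.get(node)
        if old != state:
            changes.append((timestamp, node, old, state))
    return changes


def snapshot():
    """Ask sinfo for the current state of every node."""
    result = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N %T"],
        capture_output=True, text=True, check=True,
    )
    return parse_sinfo(result.stdout)


def watch(interval=60):
    """Poll forever and print each state transition (hypothetical logger)."""
    previous = {}
    while True:
        current = snapshot()
        for change in diff_states(previous, current, time.time()):
            print(*change)
        previous = current
        time.sleep(interval)
```

Because it samples live state, a poller like this also sees transient states such as draining, at the cost of missing anything that happens between two polls.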

Michael Gutteridge

9:58 a.m.

I think the events you're looking for would be tracked in the events tables in the accounting database:

sacctmgr show event where node=<nodename>

-- Michael
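
A minimal sketch of pulling that event history into a script, assuming slurmdbd accounting is configured. The helper names and the chosen field list are illustrative, not from the thread. It uses the man-page spelling `nodes=` (the thread writes `node=`) and passes an explicit `start=`, since as far as I recall the default window only reaches back to the previous day.

```python
import subprocess

# Columns requested from sacctmgr; Reason is last so it may contain '|'.
FIELDS = ["NodeName", "Start", "End", "State", "Reason"]


def build_command(node, start):
    """Build the sacctmgr call for one node's events since `start`."""
    return [
        "sacctmgr", "--noheader", "--parsable2",
        "show", "event", "where",
        f"nodes={node}", f"start={start}",
        "format=" + ",".join(FIELDS),
    ]


def parse_events(text):
    """Parse pipe-delimited --parsable2 output into a list of dicts."""
    events = []
    for line in text.splitlines():
        parts = line.split("|", len(FIELDS) - 1)
        if len(parts) == len(FIELDS):
            events.append(dict(zip(FIELDS, parts)))
    return events


def node_events(node, start):
    """Run sacctmgr and return the parsed events for `node`."""
    result = subprocess.run(
        build_command(node, start),
        capture_output=True, text=True, check=True,
    )
    return parse_events(result.stdout)
```

For example, `node_events("node01", "2025-07-01")` would return one dict per recorded event for a hypothetical node named `node01`.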

Christopher Samuel

5:17 p.m.

On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:

...

I think the events you're looking for would be tracked in the events tables in the accounting database:

Be aware that down and drainED nodes are there, but not drainING.

So (unless something has changed in 25.05) until a draining node is empty of jobs it doesn't get recorded in slurmdbd's events table.

All the best, Chris

-- Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA

Ole Holm Nielsen

11:58 p.m.

On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:

...

On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:

...
I think the events you're looking for would be tracked in the events tables in the accounting database:

Thanks, "sacctmgr show event where node=<nodename>" is extremely useful for monitoring nodes, and I wasn't aware of this command. I've added some further examples to my Wiki page now at https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-eve...

...

Be aware that down and drainED nodes are there, but not drainING.

So (unless something has changed in 25.05) until a draining node is empty of jobs it doesn't get recorded in slurmdbd's events table.

So the sacctmgr manual page is not quite correct when it states "event: Events like downed or draining nodes on clusters." I've opened a ticket https://support.schedmd.com/show_bug.cgi?id=23337 suggesting a documentation update.

Best regards, Ole

Ole Holm Nielsen

29 Jul

1:41 a.m.

On 7/29/25 08:58, Ole Holm Nielsen wrote:

...

On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:

...
On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:

Thanks, "sacctmgr show event where node=<nodename>" is extremely useful for monitoring nodes, and I wasn't aware of this command. I've added some further examples to my Wiki page now at https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-events

If you're interested in a general node status, I've added the "sacctmgr show event" command to my shownode script: https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/shownode

/Ole

Steve Kirk

11 Aug

5:18 a.m.

On Tue, 2025-07-29 at 08:58 +0200, Ole Holm Nielsen via slurm-users wrote:

...

On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:

...
On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:

...
I think the events you're looking for would be tracked in the events tables in the accounting database:

Thanks, "sacctmgr show event where node=<nodename>" is extremely useful for monitoring nodes, and I wasn't aware of this command. I've added some further examples to my Wiki page now at https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-eve...

Thanks for the replies; I was also not aware of that command and now feel like I should have read the documentation better! That wiki is also a nice resource.

...

...
Be aware that down and drainED nodes are there, but not drainING.

Noted; I think down and drained will give me what I'm looking for. We do have monitoring for our cluster that likely has the information, but this gives me something I can use quickly from within the cluster.

Cheers, Steve
