[slurm-users] update_node / reason set to: slurm.conf / state set to DRAINED

Kevin Buckley Kevin.Buckley at pawsey.org.au
Thu Nov 5 02:00:41 UTC 2020


We have had a couple of nodes enter a DRAINED state where scontrol
gives the reason as

Reason=slurm.conf

In looking at the SlurmCtlD log we see pairs of lines as follows

  update_node: node nid00245 reason set to: slurm.conf
  update_node: node nid00245 state set to DRAINED
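
For anyone wanting to check their own logs for the same pattern, here is a rough Python sketch that pairs up the two update_node messages per node. It assumes the message text is exactly as quoted above and ignores whatever prefix (timestamp etc.) precedes "update_node:". The helper name collect_node_updates is made up for illustration and is not part of Slurm.

```python
import re
from collections import defaultdict

# Matches the two message forms quoted above; anything before
# "update_node:" (e.g. a timestamp prefix) is ignored.
UPDATE_RE = re.compile(
    r"update_node: node (?P<node>\S+) "
    r"(?P<field>reason|state) set to:? (?P<value>.+?)\s*$"
)

def collect_node_updates(lines):
    """Return {node: [(reason, state), ...]} from slurmctld log lines.

    A "reason set to" line is paired with the next "state set to"
    line seen for the same node. Hypothetical helper, not part of Slurm.
    """
    pending = {}                  # node -> most recent reason not yet paired
    updates = defaultdict(list)
    for line in lines:
        m = UPDATE_RE.search(line)
        if not m:
            continue
        node, field, value = m.group("node", "field", "value")
        if field == "reason":
            pending[node] = value
        else:
            # State change: attach the pending reason, or None if none was seen
            updates[node].append((pending.pop(node, None), value))
    return dict(updates)

if __name__ == "__main__":
    sample = [
        "update_node: node nid00245 reason set to: slurm.conf",
        "update_node: node nid00245 state set to DRAINED",
    ]
    for node, events in collect_node_updates(sample).items():
        print(node, events)
```

Filtering the result for reason == "slurm.conf" should then list the affected nodes.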

A search of the interweb thing for "Reason=slurm.conf" and
"reason set to: slurm.conf" has so far proved too much for
my search-fu.

The slurm.conf files do match across the Cray nodes, so it
seems unlikely that it would be a "something out of sync"
thing.
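
One way to double-check the "files do match" claim is to compare digests rather than eyeball diffs. The following is only a sketch: it assumes the contents of each node's slurm.conf have already been collected somehow, and group_by_digest is a hypothetical helper name.

```python
import hashlib
from collections import defaultdict

def group_by_digest(configs):
    """Group node names by the SHA-256 digest of their config contents.

    `configs` maps a node name to the raw bytes of that node's
    slurm.conf; how the bytes are gathered is left out (assumption).
    """
    groups = defaultdict(list)
    for node, content in configs.items():
        groups[hashlib.sha256(content).hexdigest()].append(node)
    return dict(groups)
```

If the returned dict has more than one key, at least one node has a differing file.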

Other "update_node" actions seen in the logs, e.g.,

"reason set to: NHC-Admindown"

also put the node into a DRAINED state but then, that's to
be expected.

We upgraded to 20.02.5 a couple of days ago and all of the
occurrences of the "reason set to: slurm.conf" message
appear only after the upgrade.


Has anyone else seen anything like this, especially any of you who
have just gone to 20.02.5?

Yours, diving into the source soon,
Kevin
-- 
Supercomputing Systems Administrator
Pawsey Supercomputing Centre


