Hello,
We've got a few nodes defined in our slurm.conf in 'FUTURE' state, as they're a new hardware type we're working on bringing into service.
The nodes are currently all in a dedicated partition, which is configured with 'State=UP'. As we've built the new nodes and started slurmd and munge on them, they've appeared in the idle state in the new partition as expected. All good so far.
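For context, the relevant slurm.conf lines look roughly like this (node names, counts and the partition name are placeholders, not our real config):

    NodeName=new[01-04] CPUs=64 RealMemory=256000 State=FUTURE
    PartitionName=newhw Nodes=new[01-04] State=UP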
However, if slurmctld is restarted, the nodes revert to the 'FUTURE' state and no longer transition to idle, accept jobs, etc.
slurmd on the new nodes can clearly still talk to slurmctld: the s* commands (sinfo, squeue, etc.) run from the new nodes work as expected, but the nodes themselves remain in FUTURE state until slurmd on each node is restarted.
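To illustrate, this is roughly what I see after a slurmctld restart (same placeholder names as above):

    # state as seen from the controller shows FUTURE again:
    scontrol show node new01 | grep -i state
    # only restarting slurmd on the node itself brings it back to idle:
    ssh new01 systemctl restart slurmd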
I may have misunderstood something about the FUTURE state, but I was expecting them to go back to idle. I understand that slurmctld doesn't communicate out to nodes in FUTURE state, but I at least expected them to be picked up when they communicate _in_ to slurmctld.
Is this expected behaviour, or perhaps a bug? The reason I've defined the new nodes this way is so that I don't have to update slurm.conf and restart slurmctld as each node is built, but can instead do that once as a single step when everything is finished. However, it seems less useful if the nodes can 'disappear' from the cluster as far as users are concerned.
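To spell out the plan (placeholder names as above), once the whole batch is built I was intending a single edit and a single restart, along these lines:

    # slurm.conf: change the new nodes' definition from
    #   NodeName=new[01-04] ... State=FUTURE
    # to
    #   NodeName=new[01-04] ... State=UNKNOWN
    # then restart the controller once:
    systemctl restart slurmctld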
Cheers, Steve