Hello,
We've got a few nodes defined in our slurm.conf in 'FUTURE' state, as it's a new hardware type we're working on bringing into service.
The nodes are currently all allocated to a dedicated partition. The partition is configured as 'state=UP'. As we've built the new nodes and started slurmd+munge, they've appeared in an idle state in the new partition as expected. All good so far.
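For reference, the definitions look roughly like this (node/partition names and sizes here are just placeholders, not our real config):

  NodeName=newnode[01-04] CPUs=64 RealMemory=256000 State=FUTURE
  PartitionName=newhw Nodes=newnode[01-04] State=UP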
However, if slurmctld is restarted, the nodes go back to being in 'FUTURE' state and do not transition to idle, accept jobs, etc.
The slurmd daemon on the new nodes can clearly still talk to slurmctld, and the s* commands run on the new nodes work as expected, but the nodes remain in FUTURE state until slurmd on each node is restarted.
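(Concretely, the workaround on each affected node is just restarting the daemon, e.g. assuming slurmd runs under systemd:

  systemctl restart slurmd

after which the node shows up as idle again.)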
I could have misunderstood something about the FUTURE state, but I was expecting them to go back to idle. I understand that slurmctld doesn't communicate out to nodes in FUTURE state, but I at least expected them to be picked up when they communicate _in_ to slurmctld.
Is this expected behaviour, or perhaps a bug? The reason I've defined the new nodes this way is so that I don't have to update slurm.conf and restart slurmctld as each one is built, but can instead do that as a single job once everything is finished. However, it seems less useful if the nodes can 'disappear' from the cluster as far as users are concerned.
Cheers, Steve
I think you have to remove them from the FUTURE state in slurm.conf.
Afternoon,
On Fri, 2025-09-26 at 09:06 +0200, Bjørn-Helge Mevik via slurm-users wrote:
I think you have to remove them from the FUTURE state in slurm.conf.
It does seem that way, and I intend to do so once the work is complete, but that also seems to limit the usefulness of the FUTURE state if I still have to update slurm.conf and 'scontrol reconfigure' each time.
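For us that would mean changing something like

  NodeName=newnode[01-04] CPUs=64 RealMemory=256000 State=FUTURE

to State=UNKNOWN (or dropping State= altogether, which I believe defaults to UNKNOWN), followed by an 'scontrol reconfigure' - again, the names and sizes above are just placeholders.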
I guess it's fine for nodes we're testing. Perhaps the solution is to ask SchedMD to add this behaviour to the docs/man page and leave it up to the user to decide what they're happy to use FUTURE for.
Cheers, Steve