[slurm-users] ec2 elastic node
Chris Samuel
chris at csamuel.org
Sat Mar 17 06:33:21 MDT 2018
On Thursday, 15 March 2018 6:04:47 PM AEDT Arie Blumenzweig wrote:
> # sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> cloud* up infinite 1 down* slurm-node0
It looks like Slurm thinks the node was booted, but cannot talk to it.
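The trailing "*" on the state is how sinfo marks a node that is not responding. If you want to pick such nodes out in a script, something along these lines might do. It is only a rough sketch that assumes the default six-column sinfo layout quoted above, and the function name unresponsive_nodes is made up for illustration:

```shell
# Hypothetical helper (not part of Slurm): read default `sinfo` output on
# stdin and print the NODELIST field of every line whose STATE ends in "*",
# which is sinfo's marker for a node that is not responding.
unresponsive_nodes() {
    # NR > 1 skips the header line; column 5 is STATE, column 6 is NODELIST.
    awk 'NR > 1 && $5 ~ /\*$/ { print $6 }'
}
```

You would use it as "sinfo | unresponsive_nodes"; a different sinfo output format would need different column numbers.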
> [2018-03-13T15:38:21.401] debug2: Error connecting slurm stream socket at
> 172.31.38.99:6818: Connection timed out
Is it possible that it booted with that IP address but slurmd was blocked by a firewall?
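One quick way to test the firewall theory from the controller is to try opening a TCP connection to the slurmd port yourself. A minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available on that host; the function name port_open is made up for illustration:

```shell
# Hypothetical helper (not part of Slurm): succeed only if a TCP connection
# to host:port can be opened within 5 seconds.
# Assumes bash (for the /dev/tcp pseudo-device) and coreutils' timeout.
port_open() {
    host=$1
    port=$2
    # Inside the bash -c script, $0 is the host and $1 is the port.
    # The timeout stops us hanging when packets are silently dropped.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

With the address and port from the log line above, that would be "port_open 172.31.38.99 6818 && echo reachable || echo blocked". A failure only tells you the port could not be reached from where you ran it, not why.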
I've not played with the cloud stuff for a long time but you may need to try:
scontrol update node=slurm-node0 state=POWER_DOWN
to see if that gets it back into its offline state properly, so that it can be
booted again.
Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC