[slurm-users] ec2 elastic node
Chris Samuel
chris at csamuel.org
Sat Mar 17 06:33:21 MDT 2018

On Thursday, 15 March 2018 6:04:47 PM AEDT Arie Blumenzweig wrote:
> # sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> cloud*       up   infinite      1  down* slurm-node0
It looks like Slurm thinks the node was booted, but cannot talk to it.
> [2018-03-13T15:38:21.401] debug2: Error connecting slurm stream socket at
> 172.31.38.99:6818: Connection timed out
Did it possibly boot with that IP address but slurmd was blocked by a firewall?
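
One quick way to test the firewall theory from the controller host is to attempt a plain TCP connection to the slurmd port. The following is only a minimal sketch: the function name `slurmd_reachable` is made up for illustration, and the address and port (172.31.38.99, 6818) are simply the ones from the log line quoted above. A successful connect only shows that something is listening and reachable on that port, not that slurmd itself is healthy.

```python
import socket


def slurmd_reachable(host: str, port: int = 6818, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; 6818 is the slurmd port seen in the
    quoted log line, not necessarily the port configured at every site.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, using the address from the quoted log line:
#   slurmd_reachable("172.31.38.99", 6818)
```

If this returns False from the slurmctld host while slurmd is running on the node, a security group or host firewall rule blocking port 6818 would be a plausible culprit.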
I've not played with the cloud stuff for a long time but you may need to try:
scontrol update node=slurm-node0 state=POWER_DOWN
to see if that gets it back into its offline state properly, so that it can be
booted again.
Good luck!
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC