<div dir="ltr">I think this error usually means that node cn7 has either the wrong /etc/hosts or the wrong /etc/slurm/slurm.conf.<div><br></div><div>E.g. try 'srun --nodelist=cn7 ping -c 1 cn7'</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, May 29, 2019 at 6:00 AM Alexander Åhman <<a href="mailto:alexander@ydesign.se">alexander@ydesign.se</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
I have a very strange problem. The cluster had been working just fine <br>
until one node died, and now I can't submit jobs to 2 of the nodes using <br>
srun from the login machine. Using sbatch works just fine, as does <br>
srun from the same host as slurmctld.<br>
All the other nodes work just fine, as they always have; only 2 nodes are <br>
experiencing this problem. Very strange...<br>
<br>
I have checked network connectivity and DNS, and both are OK. I can ping and <br>
ssh to all nodes just fine. All nodes are identical and run Slurm 18.08.<br>
I also tried rebooting the 2 nodes and slurmctld, but the problem persists.<br>
<br>
[alex@li1 ~]$ srun -w cn7 hostname<br>
srun: error: fwd_tree_thread: can't find address for host cn7, check <br>
slurm.conf<br>
srun: error: Task launch for 1088816.0 failed on node cn7: Can't find an <br>
address, check slurm.conf<br>
srun: error: Application launch failed: Can't find an address, check <br>
slurm.conf<br>
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.<br>
srun: error: Timed out waiting for job step to complete<br>
<br>
[alex@li1 ~]$ srun -w cn6 hostname<br>
<a href="http://cn6.ydesign.se" rel="noreferrer" target="_blank">cn6.ydesign.se</a><br>
<br>
What is this error "can't find address for host" about? I have searched <br>
the web but can't find any good information about what the problem is or <br>
what to do to resolve it.<br>
<br>
Any kind soul out there who knows what to do next?<br>
<br>
Regards,<br>
Alexander Åhman<br>
<br>
<br>
</blockquote></div>
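<div>To follow up on the /etc/hosts suggestion at the top of this reply, here is a minimal sketch of a helper that prints the address a hosts-format file lists for a given name. The function name lookup_host is hypothetical, and the file paths are only the ones mentioned above; they may differ on your cluster. Running it on the login node, the slurmctld host and cn7 and comparing the results might show which machine has the stale entry.</div>

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the address that a
# hosts-format file lists for a hostname; return non-zero if none is found.
lookup_host() {
    # $1 = hostname to look for, $2 = hosts file (defaults to /etc/hosts)
    awk -v h="$1" '
        /^[[:space:]]*#/ { next }            # skip full-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == h) { print $1; found = 1; exit }
            }
        }
        END { exit !found }                  # status 1 when no entry matched
    ' "${2:-/etc/hosts}"
}

# Example checks to run by hand on each machine and compare:
#   lookup_host cn7                      # entry in the local /etc/hosts
#   getent hosts cn7                     # what the system resolver returns
#   md5sum /etc/slurm/slurm.conf         # should be identical on all nodes
```

<div>If the checksums or addresses differ between the login node and the other machines, that difference is the first thing I would look at.</div>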