[slurm-users] Problem launching interactive jobs using srun
Andy Georges
Andy.Georges at UGent.be
Fri Mar 9 14:45:20 MST 2018
Hi,
> On 9 Mar 2018, at 21:58, Nicholas McCollum <nmccollum at asc.edu> wrote:
>
> Connection refused makes me think a firewall issue.
>
> Assuming this is a test environment, could you try on the compute node:
>
> # iptables-save > iptables.bak
> # iptables -F && iptables -X
>
> Then test to see if it works. To restore the firewall use:
>
> # iptables-restore < iptables.bak
>
> You may have to use...
>
> # systemctl stop firewalld
> # systemctl start firewalld
>
> If you use firewalld.
We’re using shorewall …
There is an srun process listening on the login node:
srun 8500 vsc40075 13u IPv4 597473 0t0 TCP *:36506 (LISTEN)
And slurmd on the worker node is trying to connect to it, but the connection is refused:
[2018-03-09T22:00:44.908] [47.0] debug4: adding IO connection (logical node rank 0)
[2018-03-09T22:00:44.908] [47.0] debug4: connecting IO back to 10.141.21.202:36506
[2018-03-09T22:00:44.908] [47.0] debug: _oom_event_monitor: started.
[2018-03-09T22:00:44.908] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.908] [47.0] debug3: Error connecting, picking new stream port
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: Error connecting slurm stream socket at 10.141.21.202:36506: Connection refused
[2018-03-09T22:00:44.909] [47.0] error: connect io: Connection refused
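
The refused connection can probably be reproduced outside of Slurm, which helps separate a firewall problem from a Slurm problem. Below is a minimal sketch, assuming bash and the coreutils timeout command are available on the worker node. The helper name check_port is hypothetical; the address and port are the ones from the log above, and the port only exists while that srun is still running.

```shell
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# Relies on bash's /dev/tcp pseudo-device; timeout guards against silently
# dropped packets, which would otherwise make the connect hang.
check_port() {
    host="$1"
    port="$2"
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
    fi
}

# Run on the worker node, against the srun listener on the login node.
check_port 10.141.21.202 36506
```

If this prints "closed" while lsof on the login node still shows srun listening on that port, something between the two hosts is rejecting the connection.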
Opening ports 30000-50000 in the firewall seems to do the trick. I'll try to figure out what’s different on the other machines.
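
As a quick sanity check, the port from the slurmd log does fall inside that range. The sketch below uses a hypothetical helper, port_in_range, that takes a port and a LOW-HIGH range. Since srun seems to pick a different listening port on each run, the range has to cover whatever it may pick. As far as I know, slurm.conf has an SrunPortRange parameter to pin that range, but I have not tried it here.

```shell
# Hypothetical helper: succeed if port $1 lies within range $2, written LOW-HIGH.
port_in_range() {
    port="$1"
    low="${2%-*}"    # text before the dash
    high="${2#*-}"   # text after the dash
    [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]
}

# The port from the slurmd log against the range opened in the firewall.
if port_in_range 36506 30000-50000; then
    echo "36506 is covered"
else
    echo "36506 is not covered"
fi
```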
Thanks for the pointers and help!
— Andy