[slurm-users] Problem launching interactive jobs using srun

Andy Georges Andy.Georges at UGent.be
Fri Mar 9 14:45:20 MST 2018


Hi,

> On 9 Mar 2018, at 21:58, Nicholas McCollum <nmccollum at asc.edu> wrote:
> 
> Connection refused makes me think a firewall issue.
> 
> Assuming this is a test environment, could you try on the compute node:
> 
> # iptables-save > iptables.bak
> # iptables -F && iptables -X
> 
> Then test to see if it works.  To restore the firewall use:
> 
> # iptables-restore < iptables.bak
> 
> You may have to use...
> 
> # systemctl stop firewalld
> # systemctl start firewalld
> 
> If you use firewalld.

We’re using shorewall …


There is an srun process listening on the login node:

srun    8500 vsc40075   13u  IPv4 597473      0t0     TCP *:36506 (LISTEN)
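
For anyone reproducing this: the line above is lsof-style output, and a small helper can pull the listening port out of it so it can be compared against the firewall rules. This is only a sketch. The helper name listening_ports is made up for illustration, and the lsof invocation in the comment assumes lsof is installed on the login node.

```shell
# Extract the TCP port from lsof-style lines that are in LISTEN state.
# Reads lsof output on stdin and prints one port number per line.
# (listening_ports is a hypothetical helper name, not part of Slurm.)
listening_ports() {
    awk '/\(LISTEN\)/ { n = split($(NF-1), a, ":"); print a[n] }'
}

# Possible usage on the login node while srun is waiting:
#   lsof -nP -iTCP -sTCP:LISTEN -a -c srun | listening_ports
```

Fed the line shown above, this prints 36506, which is the port slurmd tries to reach in the log below.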


And slurmd on the worker node is trying to connect to it:

[2018-03-09T22:00:44.908] [47.0] debug4: adding IO connection (logical node rank 0)
[2018-03-09T22:00:44.908] [47.0] debug4: connecting IO back to 10.141.21.202:36506
[2018-03-09T22:00:44.908] [47.0] debug:  _oom_event_monitor: started.
[2018-03-09T22:00:44.908] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.908] [47.0] debug3: Error connecting, picking new stream port
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: slurm_connect failed: Connection refused
[2018-03-09T22:00:44.909] [47.0] debug2: Error connecting slurm stream socket at 10.141.21.202:36506: Connection refused
[2018-03-09T22:00:44.909] [47.0] error: connect io: Connection refused
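
The same connection attempt can be reproduced by hand from the worker node, independent of slurmd, which helps confirm that the refusal comes from the network path rather than from Slurm. A minimal sketch, assuming bash and the coreutils timeout command are available. check_port is a hypothetical helper name, and the address and port in the usage comment are simply the ones from the log above.

```shell
# Return 0 if a TCP connection to host:port can be opened within 3 seconds,
# non-zero otherwise (refused, filtered, or timed out).
# Relies on bash's /dev/tcp pseudo-device, hence the explicit "bash -c".
# (check_port is a hypothetical helper name.)
check_port() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Possible usage on the worker node while srun is still listening:
#   check_port 10.141.21.202 36506 && echo open || echo "refused or filtered"
```

Because srun may pick a different port for every job, the port has to be read from the login node or from the slurmd log each time.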


Opening TCP ports 30000-50000 in the firewall seems to do the trick. I will try to figure out what is different on the other machines.
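
One thing that could differ between machines is the SrunPortRange setting in slurm.conf, which restricts the ports srun listens on. I have not confirmed that this is the cause here. The sketch below only prints the configured value so it can be compared across hosts and against the firewall range. srun_port_range is a hypothetical helper name, and the path /etc/slurm/slurm.conf in the usage comment is an assumption that depends on the installation.

```shell
# Print the SrunPortRange value from a slurm.conf-style file.
# Prints nothing if the parameter is not set in that file.
# (srun_port_range is a hypothetical helper name.)
srun_port_range() {
    grep -i '^[[:space:]]*SrunPortRange[[:space:]]*=' "$1" \
        | tail -n 1 \
        | cut -d= -f2 \
        | tr -d '[:space:]'
}

# Possible usage (the config path is an assumption):
#   srun_port_range /etc/slurm/slurm.conf
# The running configuration can also be inspected with:
#   scontrol show config | grep -i SrunPortRange
```

If the value is set, the firewall on the login node would need to allow that same range from the compute nodes.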

Thanks for the pointers and help!

— Andy