[slurm-users] Internet connection loss with srun to a node
Renfro, Michael
Renfro at tntech.edu
Sun Aug 2 13:26:20 UTC 2020
This is probably unrelated to Slurm entirely, and is most likely a lower-level networking issue. I can guarantee that it’s possible to access Internet resources from a compute node. Notes and things to check:
1. Both ping and http/https are IP protocols, but are very different (ping isn’t even TCP or UDP, it’s ICMP), so even if you needed proxy variables for http and https to work, they shouldn’t affect ping.
2. Do http or https transfers work from a compute node? For example, a GitHub clone, or a test with curl or wget to a nearby web server? Do your proxy variables exist on the compute node, and most importantly, is there a proxy server listening and functional on the host and port that the variables point to?
3. What’s the default gateway for your compute nodes? Does that gateway provide network address translation (NAT) for the nodes, or does it work as a traditional router?
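The three checks above can be run from inside the srun shell with something like the sketch below. It assumes a Linux compute node with iputils ping, iproute2, curl, and coreutils timeout installed. The function names proxy_host_port and diagnose are made up for illustration, proxy.example.org:3128 is a placeholder rather than anything from this thread, and falling back to port 80 when the proxy variable has no port is my assumption. It does not handle IPv6 literals in the proxy URL.

```shell
# Split a proxy URL such as http://proxy.example.org:3128/ into "host port".
# (proxy.example.org and 3128 are placeholders, not values from this thread.)
proxy_host_port() {
    local url="$1"
    local hostport host port
    hostport="${url#*://}"       # drop the scheme, if any
    hostport="${hostport%%/*}"   # drop any trailing path
    hostport="${hostport##*@}"   # drop user:password@, if present
    host="${hostport%%:*}"
    port="${hostport##*:}"
    # No explicit port in the URL: fall back to 80 (an assumption).
    if [ "$port" = "$hostport" ]; then port=80; fi
    printf '%s %s\n' "$host" "$port"
}

# Run the checks from the list above on the current node.
diagnose() {
    local proxy="${https_proxy:-${http_proxy:-}}"
    local host port

    echo "== default route (item 3) =="
    ip route show default

    echo "== ICMP, unaffected by proxy variables (item 1) =="
    ping -c 3 -W 2 google.com

    echo "== proxy variables seen by this shell (item 2) =="
    env | grep -i '_proxy=' || echo "no proxy variables set"

    if [ -n "$proxy" ]; then
        # Word-split "host port" into $1 and $2.
        set -- $(proxy_host_port "$proxy")
        host="$1"; port="$2"
        echo "== TCP connect to proxy $host:$port =="
        # /dev/tcp is a bash feature, hence the explicit bash -c.
        if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port"; then
            echo "proxy port is reachable"
        else
            echo "cannot reach proxy port"
        fi
    fi

    echo "== HTTPS request (item 2) =="
    curl -sS -m 10 -o /dev/null -w 'HTTP status: %{http_code}\n' https://github.com
}
```

Source the file and call diagnose once on the frontend and once inside the srun shell, then compare the two outputs. If the default route is missing or ping fails only on the compute node, look at the gateway and NAT; if ping fails but the proxy port is reachable and curl succeeds, the node simply has no direct route out and relies on the proxy.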
________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Mahmood Naderan <mahmood.nt at gmail.com>
Sent: Sunday, August 2, 2020 7:52:52 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [slurm-users] Internet connection loss with srun to a node
Hi
A frontend machine is connected to the internet, and from that machine I use srun to get a bash shell on another node. But it seems that the node is unable to access the internet. The http_proxy and https_proxy variables are defined in ~/.bashrc.
mahmood@main-proxy:~$ ping google.com
PING google.com (216.58.215.238) 56(84) bytes of data.
64 bytes from zrh11s02-in-f14.1e100.net (216.58.215.238): icmp_seq=1 ttl=114 time=1.38 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.384/1.384/1.384/0.000 ms
mahmood@main-proxy:~$ srun -p gpu_part --gres=gpu:titanv:1 --pty /bin/bash
mahmood@fry0:~$ ping google.com
PING google.com (216.58.215.238) 56(84) bytes of data.
^C
--- google.com ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2026ms
I guess that is related to slurm and srun.
Any idea for that?
Regards,
Mahmood