[slurm-users] Internet connection loss with srun to a node
Brian Andrus
toomuchit at gmail.com
Sun Aug 2 21:13:08 UTC 2020
This is very likely by design of the cluster and/or network. Otherwise
users could use the cluster to mine bitcoin and such.
Brian Andrus
On 8/2/2020 7:11 AM, Mahmood Naderan wrote:
> I thought that maybe srun doesn't transfer all settings from the head
> node to the compute node.
> The wget command works on the frontend but not on the compute node.
>
> mahmood@main-proxy:~$ wget google.com
> --2020-08-02 16:05:55-- http://google.com/
> Resolving google.com (google.com)... 216.58.215.238, 2a00:1450:400a:800::200e
> Connecting to google.com (google.com)|216.58.215.238|:80... connected.
> HTTP request sent, awaiting response... 301 Moved Permanently
> Location: http://www.google.com/ [following]
> --2020-08-02 16:05:55-- http://www.google.com/
> Resolving www.google.com (www.google.com)... 172.217.168.68, 2a00:1450:400a:803::2004
> Connecting to www.google.com (www.google.com)|172.217.168.68|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
> Saving to: ‘index.html’
>
> index.html [ <=> ] 12.68K --.-KB/s in 0s
>
> 2020-08-02 16:05:56 (196 MB/s) - ‘index.html’ saved [12983]
>
> mahmood@main-proxy:~$ srun -p gpu_part --gres=gpu:titanv:1 --pty /bin/bash
> mahmood@fry0:~$ wget google.com
> --2020-08-02 16:05:30-- http://google.com/
> Resolving google.com (google.com)... 216.58.215.238, 2a00:1450:400a:800::200e
> Connecting to google.com (google.com)|216.58.215.238|:80... ^C
> mahmood@fry0:~$
>
>
>
>
> I will check the gateway with the admin.
> Thanks for the hint.
>
>
>
> Regards,
> Mahmood
>
>
>
>
> On Sun, Aug 2, 2020 at 5:58 PM Renfro, Michael <Renfro at tntech.edu> wrote:
>
> Probably unrelated to Slurm entirely; this most likely calls for
> lower-level network diagnostics. I can guarantee that it’s
> possible to access Internet resources from a compute node. Notes
> and things to check:
>
> 1. Both ping and HTTP/HTTPS run over IP, but they are very
> different (ping isn’t even TCP or UDP, it’s ICMP), so even if you
> needed proxy variables for HTTP and HTTPS to work, they shouldn’t
> affect ping.
>
> 2. Do HTTP or HTTPS transfers work from a compute node? A GitHub
> clone, or a test with curl or wget to a nearby web server? Do your
> proxy variables exist on the compute node, and most importantly,
> is there a proxy server listening and functional on the host and
> port that the variables point to?
>
> 3. What’s the default gateway for your compute nodes? Does that
> gateway provide network address translation (NAT) for the nodes,
> or does it work as a traditional router?
>
> ------------------------------------------------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Mahmood Naderan <mahmood.nt at gmail.com>
> *Sent:* Sunday, August 2, 2020 7:52:52 AM
> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject:* [slurm-users] Internet connection loss with srun to a node
> Hi,
> A frontend machine is connected to the internet, and from that
> machine I use srun to get a bash shell on another node. But it seems
> that the node is unable to access the internet. The http_proxy and
> https_proxy variables are defined in ~/.bashrc.
>
> mahmood@main-proxy:~$ ping google.com
> PING google.com (216.58.215.238) 56(84) bytes of data.
> 64 bytes from zrh11s02-in-f14.1e100.net (216.58.215.238): icmp_seq=1 ttl=114 time=1.38 ms
> ^C
> --- google.com ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.384/1.384/1.384/0.000 ms
> mahmood@main-proxy:~$ srun -p gpu_part --gres=gpu:titanv:1 --pty /bin/bash
> mahmood@fry0:~$ ping google.com
> PING google.com (216.58.215.238) 56(84) bytes of data.
> ^C
> --- google.com ping statistics ---
> 3 packets transmitted, 0 received, 100% packet loss, time 2026ms
>
> I guess this is related to Slurm and srun.
> Any ideas?
>
> Regards,
> Mahmood
>
>
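
The checks suggested in the thread (are the proxy variables present on the compute node, is anything listening where they point, and does the node have a default gateway) could be sketched roughly as below. This is only an illustration, not anything Slurm provides: the function names are hypothetical, the gateway lookup assumes a Linux node with /proc/net/route, and google.com:80 is simply the target used in the transcripts above. It tests TCP rather than ICMP, since ping from a script normally needs extra privileges.

```python
import os
import socket
import struct
from urllib.parse import urlsplit

# Variables named in the thread, plus no_proxy (an assumption; adjust as needed).
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def proxy_settings(env=None):
    """Return the proxy-related variables that are set (lower and upper case)."""
    env = os.environ if env is None else env
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.upper()):
            if env.get(key):
                found[key] = env[key]
    return found


def proxy_endpoint(value):
    """Split a proxy setting such as 'http://host:3128' into (host, port)."""
    if "://" not in value:
        value = "http://" + value  # tolerate 'host:port' without a scheme
    parts = urlsplit(value)
    return parts.hostname, parts.port


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def parse_default_gateway(route_text):
    """Return the default gateway from /proc/net/route content, or None."""
    for line in route_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != "00000000":
            continue  # header line or a non-default route
        if int(fields[3], 16) & 0x2:  # RTF_GATEWAY flag
            # The kernel prints the address as little-endian hex.
            return socket.inet_ntoa(struct.pack("<L", int(fields[2], 16)))
    return None


if __name__ == "__main__":
    settings = proxy_settings()
    print("proxy variables:", settings or "none set")
    for key, value in settings.items():
        if key.lower() == "no_proxy":
            continue  # a host list, not a proxy address
        host, port = proxy_endpoint(value)
        if host and port:
            print(f"  {key} -> {host}:{port} reachable:", tcp_reachable(host, port))
    try:
        with open("/proc/net/route") as fh:
            print("default gateway:", parse_default_gateway(fh.read()))
    except OSError:
        print("default gateway: /proc/net/route not readable")
    print("direct TCP to google.com:80:", tcp_reachable("google.com", 80))
```

Running it once on the frontend and once inside the srun shell, then comparing the two outputs, should show whether the difference lies in the environment (proxy variables missing on the node) or in routing (no gateway, or a gateway that does not forward or NAT the node's traffic).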