[slurm-users] Internet connection loss with srun to a node

Brian Andrus toomuchit at gmail.com
Sun Aug 2 21:13:08 UTC 2020


This is very likely by design of the cluster and/or network: compute
nodes are often denied direct Internet access. Otherwise users could
use the cluster to mine bitcoin and such.

Brian Andrus

On 8/2/2020 7:11 AM, Mahmood Naderan wrote:
> I thought that maybe srun doesn't transfer all settings from the head
> node to the compute node.
> The wget command works on the frontend but doesn't work on the compute
> node.
>
> mahmood@main-proxy:~$ wget google.com
> --2020-08-02 16:05:55-- http://google.com/
> Resolving google.com (google.com)... 216.58.215.238, 2a00:1450:400a:800::200e
> Connecting to google.com (google.com)|216.58.215.238|:80... connected.
> HTTP request sent, awaiting response... 301 Moved Permanently
> Location: http://www.google.com/ [following]
> --2020-08-02 16:05:55-- http://www.google.com/
> Resolving www.google.com (www.google.com)... 172.217.168.68, 2a00:1450:400a:803::2004
> Connecting to www.google.com (www.google.com)|172.217.168.68|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
> Saving to: ‘index.html’
>
> index.html                         [ <=>                               ]  12.68K  --.-KB/s    in 0s
>
> 2020-08-02 16:05:56 (196 MB/s) - ‘index.html’ saved [12983]
>
> mahmood@main-proxy:~$ srun -p gpu_part --gres=gpu:titanv:1 --pty /bin/bash
> mahmood@fry0:~$ wget google.com
> --2020-08-02 16:05:30-- http://google.com/
> Resolving google.com (google.com)... 216.58.215.238, 2a00:1450:400a:800::200e
> Connecting to google.com (google.com)|216.58.215.238|:80... ^C
> mahmood@fry0:~$
>
> I will check the gateway with the admin.
> Thanks for the hint.
>
> Regards,
> Mahmood
>
> On Sun, Aug 2, 2020 at 5:58 PM Renfro, Michael <Renfro at tntech.edu> wrote:
>
>     Probably unrelated to Slurm entirely; this most likely calls for
>     lower-level network diagnostics. I can guarantee that it’s
>     possible to access Internet resources from a compute node. Notes
>     and things to check:
>
>     1. Both ping and http/https are IP protocols, but are very
>     different (ping isn’t even TCP or UDP, it’s ICMP), so even if you
>     needed proxy variables for http and https to work, they shouldn’t
>     affect ping.
>
>     2. Do http or https transfers work from a compute node? For
>     example, a GitHub clone, or a test with curl or wget against a
>     nearby web server? Do your proxy variables exist on the compute
>     node, and most importantly, is there a proxy server listening and
>     functional on the host and port that the variables point to?
>
>     3. What’s the default gateway for your compute nodes? Does that
>     gateway provide network address translation (NAT) for the nodes,
>     or does it work as a traditional router?
>
>     ------------------------------------------------------------------------
>     *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
>     Mahmood Naderan <mahmood.nt at gmail.com>
>     *Sent:* Sunday, August 2, 2020 7:52:52 AM
>     *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
>     *Subject:* [slurm-users] Internet connection loss with srun to a node
>
>     Hi,
>     A frontend machine is connected to the Internet, and from that
>     machine I use srun to get a bash shell on another node. But it
>     seems that the node is unable to access the Internet. The
>     http_proxy and https_proxy variables are defined in ~/.bashrc.
>
>     mahmood@main-proxy:~$ ping google.com
>     PING google.com (216.58.215.238) 56(84) bytes of data.
>     64 bytes from zrh11s02-in-f14.1e100.net (216.58.215.238): icmp_seq=1 ttl=114 time=1.38 ms
>     ^C
>     --- google.com ping statistics ---
>     1 packets transmitted, 1 received, 0% packet loss, time 0ms
>     rtt min/avg/max/mdev = 1.384/1.384/1.384/0.000 ms
>     mahmood@main-proxy:~$ srun -p gpu_part --gres=gpu:titanv:1 --pty /bin/bash
>     mahmood@fry0:~$ ping google.com
>     PING google.com (216.58.215.238) 56(84) bytes of data.
>     ^C
>     --- google.com ping statistics ---
>     3 packets transmitted, 0 received, 100% packet loss, time 2026ms
>
>     I guess this is related to Slurm and srun.
>     Any ideas?
>
>     Regards,
>     Mahmood
>
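
The checks suggested in the quoted reply (are the proxy variables set on
the compute node, is anything listening where they point, and what is
the node's default gateway) could be sketched roughly as below, to be
run inside the srun shell on the compute node. This is only an
illustration, not a tool from the thread: the function names
proxy_endpoint, tcp_reachable and parse_default_gateway are
hypothetical, reading /proc/net/route assumes a Linux node, port 80 is
an assumed default when the proxy URL has no port, and the direct-connect
target is simply the address that appears in the wget transcript.

```python
import os
import socket
import struct
from urllib.parse import urlsplit


def proxy_endpoint(env, var="http_proxy"):
    """Return (host, port) parsed from a proxy variable, or None if unset.

    Looks up both the lower-case and upper-case spelling of the variable.
    """
    value = env.get(var) or env.get(var.upper())
    if not value:
        return None
    # urlsplit only recognises the host part when a scheme is present.
    if "://" not in value:
        value = "http://" + value
    parts = urlsplit(value)
    if parts.hostname is None:
        return None
    # Assumption: fall back to port 80 when the URL gives no port.
    return parts.hostname, parts.port or 80


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution errors.
        return False


def parse_default_gateway(route_lines):
    """Return the IPv4 default gateway from /proc/net/route lines, or None."""
    for line in route_lines[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        # Destination 0.0.0.0 with the gateway flag (0x2) set is the default route.
        if destination == "00000000" and flags & 0x2:
            # Addresses are stored as little-endian hexadecimal.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


if __name__ == "__main__":
    for var in ("http_proxy", "https_proxy"):
        endpoint = proxy_endpoint(os.environ, var)
        if endpoint is None:
            print(f"{var}: not set")
        else:
            host, port = endpoint
            print(f"{var}: {host}:{port} reachable={tcp_reachable(host, port)}")
    try:
        with open("/proc/net/route") as fh:
            print("default gateway:", parse_default_gateway(fh.readlines()))
    except OSError:
        print("default gateway: /proc/net/route not readable")
    # Direct connection without a proxy, to the address from the transcript.
    print("direct TCP to 216.58.215.238:80:", tcp_reachable("216.58.215.238", 80))
```

If the proxy variables are reported as not set on the compute node, or
the proxy endpoint is unreachable while the same script succeeds on the
frontend, that points at node environment or network configuration
rather than at srun itself.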