<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">Hi Russell Jones, <br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">did you try to stop firewall on the
client cluster-cn02 ?</div>
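<div class="moz-cite-prefix">If stopping the firewall is not an option, a
quick reachability test can show whether something is blocking the
connection. The sketch below is only an illustration, assuming the timeout
in the quoted log comes from the compute node failing to open a TCP
connection back to the submitting host. The helper name can_connect is
hypothetical, and the host and port you pass in are placeholders that are
not taken from the logs.</div>
<pre>
```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```
</pre>
<div class="moz-cite-prefix">Run from cluster-cn02, a call such as
can_connect("login-node", 60001) (both values are placeholders) returning
False while the same call succeeds with the firewall stopped would point
at the firewall rules.</div>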
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Patrick<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 16/11/2020 at 19:20, Russell Jones
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CABb1d=iyAeSXY5nr3AMQtNf=d_baCPiCZG8SfBN5V2rTO6UWxw@mail.gmail.com">
<div dir="ltr">Here are some debug logs from the compute node after
launching an interactive shell with the --x11 flag. They show
X11 forwarding being established, and then end with a connection
timeout.
<div><br>
</div>
<div><br>
</div>
<div>[2020-11-16T12:12:34.097] debug: Checking credential with
492 bytes of sig data<br>
[2020-11-16T12:12:34.098] _run_prolog: run job script took
usec=1284<br>
[2020-11-16T12:12:34.098] _run_prolog: prolog with lock for
job 30873 ran for 0 seconds<br>
[2020-11-16T12:12:34.111] debug: AcctGatherEnergy NONE plugin
loaded<br>
[2020-11-16T12:12:34.112] debug: AcctGatherProfile NONE
plugin loaded<br>
[2020-11-16T12:12:34.113] debug: AcctGatherInterconnect NONE
plugin loaded<br>
[2020-11-16T12:12:34.114] debug: AcctGatherFilesystem NONE
plugin loaded<br>
[2020-11-16T12:12:34.115] debug: switch NONE plugin loaded<br>
[2020-11-16T12:12:34.116] debug: init: Gres GPU plugin loaded<br>
[2020-11-16T12:12:34.116] [30873.extern] debug: Job
accounting gather LINUX plugin loaded<br>
[2020-11-16T12:12:34.117] [30873.extern] debug: cont_id
hasn't been set yet not running poll<br>
[2020-11-16T12:12:34.117] [30873.extern] debug: Message
thread started pid = 18771<br>
[2020-11-16T12:12:34.119] [30873.extern] debug: task NONE
plugin loaded<br>
[2020-11-16T12:12:34.120] [30873.extern] Munge credential
signature plugin loaded<br>
[2020-11-16T12:12:34.121] [30873.extern] debug: job_container
none plugin loaded<br>
[2020-11-16T12:12:34.121] [30873.extern] debug: spank:
opening plugin stack
/apps/slurm/cluster/20.02.0/etc/plugstack.conf<br>
[2020-11-16T12:12:34.121] [30873.extern] debug:
X11Parameters: (null)<br>
[2020-11-16T12:12:34.133] [30873.extern] X11 forwarding
established on DISPLAY=cluster-cn02.domain:66.0<br>
[2020-11-16T12:12:34.133] [30873.extern] debug:
jag_common_poll_data: Task 0 pid 18775 ave_freq = 4023000 mem
size/max 6750208/6750208 vmem size/max 147521536/147521536,
disk read size/max (7200/7200), disk write size/max (374/374),
time 0.000000(0+0) Energy tot/max 0/0 TotPower 0 MaxPower 0
MinPower 0<br>
[2020-11-16T12:12:34.133] [30873.extern] debug: x11
forwarding local display is 66<br>
[2020-11-16T12:12:34.133] [30873.extern] debug: x11
forwarding local xauthority is /tmp/.Xauthority-MkU8aA<br>
[2020-11-16T12:12:34.202] launch task 30873.0 request from
UID:1368 GID:512 HOST:172.21.150.10 PORT:4795<br>
[2020-11-16T12:12:34.202] debug: Checking credential with 492
bytes of sig data<br>
[2020-11-16T12:12:34.202] [30873.extern] debug: Handling
REQUEST_X11_DISPLAY<br>
[2020-11-16T12:12:34.202] [30873.extern] debug: Leaving
_handle_get_x11_display<br>
[2020-11-16T12:12:34.202] debug: Leaving
stepd_get_x11_display<br>
[2020-11-16T12:12:34.202] debug: Waiting for job 30873's
prolog to complete<br>
[2020-11-16T12:12:34.202] debug: Finished wait for job
30873's prolog to complete<br>
[2020-11-16T12:12:34.213] debug: AcctGatherEnergy NONE plugin
loaded<br>
[2020-11-16T12:12:34.214] debug: AcctGatherProfile NONE
plugin loaded<br>
[2020-11-16T12:12:34.214] debug: AcctGatherInterconnect NONE
plugin loaded<br>
[2020-11-16T12:12:34.214] debug: AcctGatherFilesystem NONE
plugin loaded<br>
[2020-11-16T12:12:34.215] debug: switch NONE plugin loaded<br>
[2020-11-16T12:12:34.215] debug: init: Gres GPU plugin loaded<br>
[2020-11-16T12:12:34.216] [30873.0] debug: Job accounting
gather LINUX plugin loaded<br>
[2020-11-16T12:12:34.216] [30873.0] debug: cont_id hasn't
been set yet not running poll<br>
[2020-11-16T12:12:34.216] [30873.0] debug: Message thread
started pid = 18781<br>
[2020-11-16T12:12:34.216] debug:
task_p_slurmd_reserve_resources: 30873 0<br>
[2020-11-16T12:12:34.217] [30873.0] debug: task NONE plugin
loaded<br>
[2020-11-16T12:12:34.217] [30873.0] Munge credential signature
plugin loaded<br>
[2020-11-16T12:12:34.217] [30873.0] debug: job_container none
plugin loaded<br>
[2020-11-16T12:12:34.217] [30873.0] debug: mpi type = pmix<br>
[2020-11-16T12:12:34.244] [30873.0] debug: spank: opening
plugin stack /apps/slurm/cluster/20.02.0/etc/plugstack.conf<br>
[2020-11-16T12:12:34.244] [30873.0] debug: mpi type = pmix<br>
[2020-11-16T12:12:34.244] [30873.0] debug: (null) [0]
mpi_pmix.c:153 [p_mpi_hook_slurmstepd_prefork] mpi/pmix: start<br>
[2020-11-16T12:12:34.244] [30873.0] debug: mpi/pmix: setup
sockets<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_client_v2.c:69 [_errhandler_reg_callbk] mpi/pmix: Error
handler registration callback is called with status=0, ref=0<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_client.c:697 [pmixp_libpmix_job_set] mpi/pmix: task
initialization<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_agent.c:229 [_agent_thread] mpi/pmix: Start agent thread<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_agent.c:330 [pmixp_agent_start] mpi/pmix: agent thread
started: tid = 70366934331824<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_agent.c:335 [pmixp_agent_start] mpi/pmix: timer thread
started: tid = 70366933283248<br>
[2020-11-16T12:12:34.273] [30873.0] debug: cluster-cn02 [0]
pmixp_agent.c:267 [_pmix_timer_thread] mpi/pmix: Start timer
thread<br>
[2020-11-16T12:12:34.273] [30873.0] debug: stdin uses a pty
object<br>
[2020-11-16T12:12:34.274] [30873.0] debug: init pty size
34:159<br>
[2020-11-16T12:12:34.274] [30873.0] in _window_manager<br>
[2020-11-16T12:12:34.274] [30873.0] debug level = 2<br>
[2020-11-16T12:12:34.274] [30873.0] debug: IO handler started
pid=18781<br>
[2020-11-16T12:12:34.275] [30873.0] starting 1 tasks<br>
[2020-11-16T12:12:34.276] [30873.0] task 0 (18801) started
2020-11-16T12:12:34<br>
[2020-11-16T12:12:34.276] [30873.0] debug:
task_p_pre_launch_priv: 30873.0<br>
[2020-11-16T12:12:34.288] [30873.0] debug:
jag_common_poll_data: Task 0 pid 18801 ave_freq = 4023000 mem
size/max 9961472/9961472 vmem size/max 512557056/512557056,
disk read size/max (0/0), disk write size/max (0/0), time
0.000000(0+0) Energy tot/max 0/0 TotPower 0 MaxPower 0
MinPower 0<br>
[2020-11-16T12:12:34.288] [30873.0] debug: Sending launch
resp rc=0<br>
[2020-11-16T12:12:34.288] [30873.0] debug: mpi type = pmix<br>
[2020-11-16T12:12:34.288] [30873.0] debug: cluster-cn02 [0]
mpi_pmix.c:180 [p_mpi_hook_slurmstepd_task] mpi/pmix: Patch
environment for task 0<br>
[2020-11-16T12:12:34.289] [30873.0] debug: task_p_pre_launch:
30873.0, task 0<br>
[2020-11-16T12:12:39.475] [30873.extern] error:
_x11_socket_read: slurm_open_msg_conn: Connection timed out<br>
</div>
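<div>For reading the DISPLAY values that appear in this thread, here is a
minimal sketch that splits a DISPLAY string into host, display number, and
screen, and maps the display number to a TCP port using the usual X11
convention (6000 + display number). The function names parse_display and
x11_tcp_port are hypothetical helpers for illustration, and whether a
given display is actually reachable over TCP depends on the setup.</div>
<pre>
```python
def parse_display(display):
    """Split an X11 DISPLAY string into (host, display_number, screen)."""
    # The host part may be empty (for example ":0"), so split on the last colon.
    host, sep, rest = display.rpartition(":")
    if not sep:
        raise ValueError("not a DISPLAY value: %r" % display)
    # The screen suffix (".0") is optional and defaults to screen 0.
    number, _, screen = rest.partition(".")
    return host, int(number), int(screen or 0)


def x11_tcp_port(display_number):
    """TCP port for a display under the usual X11 convention (6000 + N)."""
    return 6000 + display_number
```
</pre>
<div>With these helpers, "cluster-cn02.domain:66.0" parses to host
cluster-cn02.domain, display 66, screen 0, which corresponds to TCP port
6066 under that convention.</div>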
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Nov 16, 2020 at 11:50
AM Russell Jones &lt;<a href="mailto:arjones85@gmail.com"
moz-do-not-send="true">arjones85@gmail.com</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Hello,
<div><br>
</div>
<div>Thanks for the reply! </div>
<div><br>
</div>
<div>We are using Slurm 20.02.0.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Nov 16, 2020 at
10:59 AM sathish &lt;<a
href="mailto:sathish.sathishkumar@gmail.com"
target="_blank" moz-do-not-send="true">sathish.sathishkumar@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">Hi Russell Jones,
<div><span
style="font-size:14px;letter-spacing:0.2px;font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif"><br>
</span></div>
<div><span
style="font-size:14px;letter-spacing:0.2px;font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif">I
</span>believe<span
style="font-size:14px;letter-spacing:0.2px;font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif"> you
are using a Slurm version older than 19.05. The X11
forwarding code was revamped, and it works as
expected starting with version 19.05.0. </span><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Nov 16, 2020
at 10:02 PM Russell Jones &lt;<a
href="mailto:arjones85@gmail.com" target="_blank"
moz-do-not-send="true">arjones85@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Hi all,
<div><br>
</div>
<div>Hoping I can get pointed in the right
direction here. <br>
<br>
I have X11 forwarding enabled in Slurm, but
I cannot get it working properly. It
works when I test with "ssh -Y" to the compute
node from the login node, but when I go
through Slurm the DISPLAY variable looks very
different and I get an error. Example below:</div>
<div><br>
</div>
<div>[user@cluster-1 ~]$ ssh -Y cluster-cn02<br>
Last login: Mon Nov 16 10:09:18 2020 from
172.21.150.10<br>
-bash-4.2$ env | grep -i display<br>
DISPLAY=172.21.150.102:10.0<br>
-bash-4.2$ xclock<br>
Warning: Missing charsets in String to FontSet
conversion<br>
** Clock pops up and works **<br>
<br>
[user@cluster-1 ~]$ srun -p cluster -w
cluster-cn02 --x11 --pty bash -l<br>
bash-4.2$ env | grep -i display<br>
DISPLAY=localhost:28.0<br>
bash-4.2$ xclock<br>
Error: Can't open display: localhost:28.0<br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>Any ideas on where to begin looking? I'm not
sure why the DISPLAY variable is being set to
localhost instead of the login node.</div>
<div><br>
</div>
<div>Thanks!</div>
<div><br>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">Regards.....<br>
Sathish</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>