[slurm-users] X11 forwarding, slurm-22.05.3, hostbased auth
Davide DelVento
davide.quantum at gmail.com
Thu Oct 6 19:09:00 UTC 2022
Perhaps just a very trivial question, but it doesn't look like you
mentioned it: does your X-forwarding work from the login node? Maybe
the X-server on your client is the problem, and trying xclock on the
login node would clarify that.
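
For example, a minimal sketch of that check, to be run on the login node
after connecting with ssh -X. The helper name check_display is made up for
illustration; it only reports whether the session advertises a display, so
that a missing DISPLAY can be told apart from a failure inside xclock itself:

```shell
# Hypothetical helper (not part of Slurm or OpenSSH): report whether the
# current shell session has an X display advertised via $DISPLAY.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        # No display at all: forwarding was not set up for this login.
        echo "DISPLAY is not set"
        return 1
    fi
    # A display is advertised; print it so it can be compared with the
    # values seen on the compute node.
    echo "DISPLAY=$DISPLAY"
}

# Intended use on the login node (left commented out so that sourcing
# this file does not start an X client):
#   check_display && xclock
```

If check_display succeeds but xclock still cannot open the display on the
login node, the problem is probably between your workstation and the login
node rather than in Slurm.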
On Wed, Oct 5, 2022 at 12:03 PM Allan Streib <astreib at indiana.edu> wrote:
>
> Hi everyone,
>
> I'm trying to get X11 forwarding working on my cluster. I've read some
> of the threads and web posts on X11 forwarding and most of the common
> issues I'm finding seem to pertain to older versions of Slurm.
>
> I log in from my workstation to the login node with ssh -X. I have x11
> apps installed on a test compute node, j-096. Here is what I see:
>
> From the config.log when I built slurm:
>
> $ grep X11 config.log
> configure:19906: checking whether Slurm internal X11 support is enabled
> | #define WITH_SLURM_X11 1
> | #define WITH_SLURM_X11 1
> | #define WITH_SLURM_X11 1
> #define WITH_SLURM_X11 1
>
>
> From the login node:
>
> $ scontrol show config | grep X11
> PrologFlags = Alloc,Contain,X11
> X11Parameters = home_xauthority
>
> $ grep ^X11 /etc/ssh/sshd_config
> X11Forwarding yes
> X11UseLocalhost no
>
>
> Here is what I see when I try to run "xclock" on my test node:
>
> $ srun --x11 -w j-096 xclock
> Error: Can't open display: localhost:64.0
> srun: error: j-096: task 0: Exited with exit code 1
>
>
> From the sshd_config on the test node:
>
> $ grep ^X11 /etc/ssh/sshd_config
> X11Forwarding yes
>
> We are using hostbased ssh authentication in this cluster.
>
> From the slurmd.log on the test node:
>
> [2022-10-05T13:29:51.065] [2822.extern] X11 forwarding established on DISPLAY=j-096:64.0
> [2022-10-05T13:29:51.165] launch task StepId=2822.0 request from UID:8348 GID:100 HOST:172.16.100.132 PORT:58948
> [2022-10-05T13:29:51.165] task/affinity: lllp_distribution: JobId=2822 auto binding off: mask_cpu
> [2022-10-05T13:29:51.311] [2822.extern] error: _x11_socket_read: slurm_open_msg_conn(127.0.0.1:34811): Connection refused
> [2022-10-05T13:29:51.330] [2822.0] done with job
> [2022-10-05T13:29:51.346] [2822.extern] done with job
> [2022-10-05T13:29:51.436] [2822.extern] x11 forwarding shutdown complete
>
> Is the issue the two different DISPLAY values, i.e. j-096:64.0
> vs. localhost:64.0? I'm not sure how or where to reconcile these. I have
> tried with and without "X11UseLocalhost no" on the login node.
>
> Best wishes,
>
> Allan
>