[slurm-users] X11 forwarding, slurm-22.05.3, hostbased auth

Davide DelVento davide.quantum at gmail.com
Thu Oct 6 19:09:00 UTC 2022


Perhaps a very trivial question, but it doesn't look like you
mentioned it: does your X forwarding work on the login node itself? The
X server on your client may be the problem, and running xclock on the
login node would clarify that.
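
For what it's worth, here is a minimal sketch of that check, to be run on
the login node after logging in with ssh -X. The helper name
check_x11_session is made up for illustration; it is not part of Slurm or
OpenSSH. It only reports whether the session has a DISPLAY at all, which
is the first thing I would rule out.

```shell
# Diagnostic sketch: report whether this shell has an X11 display to talk to.
# check_x11_session is a hypothetical helper name, not a Slurm or OpenSSH tool.
check_x11_session() {
    # ssh -X sets DISPLAY on the remote side when forwarding was negotiated.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; X11 forwarding is not active in this session"
        return 1
    fi
    # Print the value so it can be compared with what the job step sees.
    echo "DISPLAY=${DISPLAY}"
    return 0
}
```

If it succeeds, running "check_x11_session && xclock" on the login node
should open a clock on your workstation. If that already fails, the
problem is between your client and the login node rather than in Slurm.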

On Wed, Oct 5, 2022 at 12:03 PM Allan Streib <astreib at indiana.edu> wrote:
>
> Hi everyone,
>
> I'm trying to get X11 forwarding working on my cluster. I've read some
> of the threads and web posts on X11 forwarding and most of the common
> issues I'm finding seem to pertain to older versions of Slurm.
>
> I log in from my workstation to the login node with ssh -X. I have x11
> apps installed on a test compute node, j-096. Here is what I see:
>
> From the config.log when I built slurm:
>
>     $ grep X11 config.log
>     configure:19906: checking whether Slurm internal X11 support is enabled
>     | #define WITH_SLURM_X11 1
>     | #define WITH_SLURM_X11 1
>     | #define WITH_SLURM_X11 1
>     #define WITH_SLURM_X11 1
>
>
> From the login node:
>
>     $ scontrol show config | grep X11
>     PrologFlags             = Alloc,Contain,X11
>     X11Parameters           = home_xauthority
>
>     $ grep ^X11 /etc/ssh/sshd_config
>     X11Forwarding yes
>     X11UseLocalhost no
>
>
> Here is what I see when I try to run "xclock" on my test node:
>
>     $ srun --x11 -w j-096 xclock
>     Error: Can't open display: localhost:64.0
>     srun: error: j-096: task 0: Exited with exit code 1
>
>
> From the sshd_config on the test node:
>
>     $ grep ^X11 /etc/ssh/sshd_config
>     X11Forwarding yes
>
> We are using hostbased ssh authentication in this cluster.
>
> From the slurmd.log on the test node:
>
>     [2022-10-05T13:29:51.065] [2822.extern] X11 forwarding established on DISPLAY=j-096:64.0
>     [2022-10-05T13:29:51.165] launch task StepId=2822.0 request from UID:8348 GID:100 HOST:172.16.100.132 PORT:58948
>     [2022-10-05T13:29:51.165] task/affinity: lllp_distribution: JobId=2822 auto binding off: mask_cpu
>     [2022-10-05T13:29:51.311] [2822.extern] error: _x11_socket_read: slurm_open_msg_conn(127.0.0.1:34811): Connection refused
>     [2022-10-05T13:29:51.330] [2822.0] done with job
>     [2022-10-05T13:29:51.346] [2822.extern] done with job
>     [2022-10-05T13:29:51.436] [2822.extern] x11 forwarding shutdown complete
>
> Is the issue the two different DISPLAY values, i.e. j-096:64.0
> vs. localhost:64.0? I'm not sure how or where to reconcile these. I have
> tried with and without "X11UseLocalhost no" on the login node.
>
> Best wishes,
>
> Allan
>


