[slurm-users] X11 forwarding, slurm-22.05.3, hostbased auth

Allan Streib astreib at indiana.edu
Wed Oct 5 18:00:24 UTC 2022


Hi everyone,

I'm trying to get X11 forwarding working on my cluster. I've read some
of the threads and web posts on X11 forwarding and most of the common
issues I'm finding seem to pertain to older versions of Slurm.

I log in from my workstation to the login node with "ssh -X". I have X11
apps installed on a test compute node, j-096. Here is what I see:

From the config.log when I built Slurm:

    $ grep X11 config.log
    configure:19906: checking whether Slurm internal X11 support is enabled
    | #define WITH_SLURM_X11 1
    | #define WITH_SLURM_X11 1
    | #define WITH_SLURM_X11 1
    #define WITH_SLURM_X11 1


From the login node:

    $ scontrol show config | grep X11
    PrologFlags             = Alloc,Contain,X11
    X11Parameters           = home_xauthority

    $ grep ^X11 /etc/ssh/sshd_config
    X11Forwarding yes
    X11UseLocalhost no


Here is what I see when I try to run "xclock" on my test node:

    $ srun --x11 -w j-096 xclock
    Error: Can't open display: localhost:64.0
    srun: error: j-096: task 0: Exited with exit code 1


From the sshd_config on the test node:

    $ grep ^X11 /etc/ssh/sshd_config
    X11Forwarding yes

We are using hostbased ssh authentication in this cluster.

From the slurmd.log on the test node:

    [2022-10-05T13:29:51.065] [2822.extern] X11 forwarding established on DISPLAY=j-096:64.0
    [2022-10-05T13:29:51.165] launch task StepId=2822.0 request from UID:8348 GID:100 HOST:172.16.100.132 PORT:58948
    [2022-10-05T13:29:51.165] task/affinity: lllp_distribution: JobId=2822 auto binding off: mask_cpu
    [2022-10-05T13:29:51.311] [2822.extern] error: _x11_socket_read: slurm_open_msg_conn(127.0.0.1:34811): Connection refused
    [2022-10-05T13:29:51.330] [2822.0] done with job
    [2022-10-05T13:29:51.346] [2822.extern] done with job
    [2022-10-05T13:29:51.436] [2822.extern] x11 forwarding shutdown complete

Is the issue the two different DISPLAY values, i.e. j-096:64.0 in the
slurmd.log vs. localhost:64.0 in the xclock error? I'm not sure how or
where to reconcile these. I have tried with and without
"X11UseLocalhost no" on the login node.
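
To make the comparison concrete, here is a minimal bash sketch that splits
the two DISPLAY strings above into host and display number. The function
names are hypothetical helpers of my own, not part of Slurm or X.Org, and
the port calculation assumes the usual X11 convention of TCP port 6000 plus
the display number.

```shell
#!/usr/bin/env bash
# Split an X11 DISPLAY string of the form host:display.screen.
# These helpers are hypothetical, written only for this illustration.

display_host() {
    # Everything before the last colon is the host part (may be empty).
    printf '%s\n' "${1%:*}"
}

display_number() {
    # Strip through the last colon, then drop an optional ".screen" suffix.
    local rest="${1##*:}"
    printf '%s\n' "${rest%%.*}"
}

display_tcp_port() {
    # Assumption: the conventional X11 mapping, TCP port = 6000 + display number.
    echo $(( 6000 + $(display_number "$1") ))
}

# The two values seen above: one from slurmd.log, one from the xclock error.
for d in "j-096:64.0" "localhost:64.0"; do
    printf '%s host=%s display=%s port=%s\n' \
        "$d" "$(display_host "$d")" "$(display_number "$d")" "$(display_tcp_port "$d")"
done
```

Both strings share the same display number and differ only in the host
part, so if the port convention holds here, the remaining question is
which address on j-096 the forwarded display is actually reachable on.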

Best wishes,

Allan


