On Wednesday, 23 July 2025 12:19:42 CEST Patryk Bełzak via slurm-users wrote:
Hi,
we've recently upgraded our slurm from 24.11.3 to 25.05.1, and it seems that ssh X11 forwarding has been broken since the upgrade.
Quick recap:
- on Monday the 14th I upgraded slurmdbd and slurmctld - X forwarding was still working
- on Tuesday the 15th I upgraded slurmd - X forwarding stopped working
The issue is very hard to pin down, and it looks like it sits somewhere in the slurm code. You can submit a job with --x11 and it starts correctly. The Xauthority file is created and you have all the magic cookies needed, but when you try to start any application you get an error that I guess is related to permissions. See for yourself:
me@sand ~ ssh -X -Y ui
[wcss] me@ui.wcss.pl:~ > srun -p lem-cpu-short -A kdm-staff --gres=storage:local:50G -c 12 --mem 12G -t 1:0:0 --x11 --pty /bin/bash
[wcss] me@r17ch05b01 ~ > xauth list
r17ch05b01.lem.kdm.wcss.pl/unix:91 MIT-MAGIC-COOKIE-1 d82a2efd
[wcss] me@r17ch05b01 ~ > xterm
xterm: Xt error: Can't open display: localhost:91.0
[wcss] me@r17ch05b01 ~ > date && telnet -4 localhost 6091 || date
Wed Jul 23 12:02:39 CEST 2025
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
Wed Jul 23 12:02:41 CEST 2025
As you can see, the connection to the port is dropped after a second or two. It doesn't matter which ssh flags you pick (-X, -Y, or both). X forwarding works when you log in as a regular user outside of a slurm job. Also, if I run ssh localhost inside a job, I can connect to the port assigned to $DISPLAY and the connection isn't dropped - but X still doesn't work, because $DISPLAY and the cookies get messed up when you do a triple jump with one hop on the same host.
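For anyone who wants to repeat the port check without telnet, here is a rough Python sketch of the same test. It relies on the usual X11 convention that TCP display N listens on port 6000 + N, which is why localhost:91.0 corresponds to port 6091 above. The names display_to_port and probe and the 5-second hold time are made up for illustration, and unlike telnet -4 it does not force IPv4.

```python
import re
import socket
import time

X11_BASE_PORT = 6000  # X11 convention: TCP display N listens on port 6000 + N


def display_to_port(display: str) -> int:
    """Map a TCP-style DISPLAY value such as 'localhost:91.0' to its port."""
    match = re.fullmatch(r"[^:]*:(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"unrecognised DISPLAY value: {display!r}")
    return X11_BASE_PORT + int(match.group(1))


def probe(display: str, hold: float = 5.0) -> str:
    """Connect to the display's port and report whether the peer hangs up
    within `hold` seconds (hypothetical helper, not part of slurm)."""
    # Simplification: an empty host normally means a unix socket; here we
    # just fall back to localhost and always use TCP.
    host = display.split(":", 1)[0] or "localhost"
    port = display_to_port(display)
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=hold) as sock:
            sock.settimeout(hold)
            try:
                data = sock.recv(1)  # b"" means the peer closed the socket
            except socket.timeout:
                return f"port {port}: still open after {hold:.0f}s"
            elapsed = time.monotonic() - start
            if data == b"":
                return f"port {port}: closed by peer after {elapsed:.1f}s"
            return f"port {port}: peer sent data after {elapsed:.1f}s"
    except OSError as exc:
        return f"port {port}: connection error ({exc})"
```

Inside the job you would call something like print(probe(os.environ["DISPLAY"])) after importing os. A "closed by peer" result after a second or two would match the telnet behaviour shown above.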
Our worker nodes are mostly on el9.5 AlmaLinux. Some are on el8.10 - and there you actually can do some X forwarding, but you must use both -X and -Y (which wasn't the case before the slurm upgrade). TLS is disabled in slurm.conf. I am 100% sure that both SSHD and Xorg are properly configured.
Has anyone encountered a similar issue? Or are there any comments from the slurm dev team?
Best regards
Patryk
--
Wroclaw Centre for Networking and Supercomputing
I did create a bug report:
https://support.schedmd.com/show_bug.cgi?id=23190
I got the following response by email:
Currently, this bug is showing as unsupported in our system. Unsupported bugs are given a very low priority and most times the unsupported bugs are never reviewed by the support team as their focus is on sites with support contracts.
If you have a support contract, you might be able to raise the priority.
regards
Markus Köberl