[slurm-users] srun --x11 connection rejected because of wrong authentication
Hadrian Djohari
hxd58 at case.edu
Thu Jun 7 19:48:16 MDT 2018
Hi,
I do not remember whether we had the same error message. But if the
user's known_hosts file has a stale entry for the node he is trying to
connect to, X11 won't forward properly. Once the known_hosts entry has
been deleted, X11 connects just fine.
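For example (a minimal sketch; substitute the actual node name, here the
cn100 from the error below):

$ ssh-keygen -R cn100   # removes the stale known_hosts entry for that node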
Hadrian
On Thu, Jun 7, 2018 at 6:26 PM, Christopher Benjamin Coffey <
Chris.Coffey at nau.edu> wrote:
> Hi,
>
> I've compiled slurm 17.11.7 with x11 support. We can ssh to a node from
> the login node and get xeyes to work, etc. However, srun --x11 xeyes
> results in:
>
> [cbc at wind ~ ]$ srun --x11 --reservation=root_58 xeyes
> X11 connection rejected because of wrong authentication.
> Error: Can't open display: localhost:60.0
> srun: error: cn100: task 0: Exited with exit code 1
>
> On the node in slurmd.log it says:
>
> [2018-06-07T15:04:29.932] _run_prolog: run job script took usec=1
> [2018-06-07T15:04:29.932] _run_prolog: prolog with lock for job 11806306
> ran for 0 seconds
> [2018-06-07T15:04:29.957] [11806306.extern] task/cgroup:
> /slurm/uid_3301/job_11806306: alloc=1000MB mem.limit=1000MB
> memsw.limit=1000MB
> [2018-06-07T15:04:29.957] [11806306.extern] task/cgroup:
> /slurm/uid_3301/job_11806306/step_extern: alloc=1000MB mem.limit=1000MB
> memsw.limit=1000MB
> [2018-06-07T15:04:30.138] [11806306.extern] X11 forwarding established on
> DISPLAY=cn100:60.0
> [2018-06-07T15:04:30.239] launch task 11806306.0 request from
> 3301.3302 at 172.16.3.21 (port 32453)
> [2018-06-07T15:04:30.240] lllp_distribution jobid [11806306] implicit auto
> binding: cores,one_thread, dist 1
> [2018-06-07T15:04:30.240] _task_layout_lllp_cyclic
> [2018-06-07T15:04:30.240] _lllp_generate_cpu_bind jobid [11806306]:
> mask_cpu,one_thread, 0x0000001
> [2018-06-07T15:04:30.268] [11806306.0] task/cgroup:
> /slurm/uid_3301/job_11806306: alloc=1000MB mem.limit=1000MB
> memsw.limit=1000MB
> [2018-06-07T15:04:30.268] [11806306.0] task/cgroup:
> /slurm/uid_3301/job_11806306/step_0: alloc=1000MB mem.limit=1000MB
> memsw.limit=1000MB
> [2018-06-07T15:04:30.303] [11806306.0] task_p_pre_launch: Using
> sched_affinity for tasks
> [2018-06-07T15:04:30.310] [11806306.extern] error: _handle_channel: remote
> disconnected
> [2018-06-07T15:04:30.310] [11806306.extern] error: _handle_channel:
> exiting thread
> [2018-06-07T15:04:30.376] [11806306.0] done with job
> [2018-06-07T15:04:30.413] [11806306.extern] x11 forwarding shutdown
> complete
> [2018-06-07T15:04:30.443] [11806306.extern] _oom_event_monitor: oom-kill
> event count: 1
> [2018-06-07T15:04:30.508] [11806306.extern] done with job
>
> It seems like it's close: srun and the node agree on the port to
> connect on, but there is a discrepancy between slurmd reporting the
> node name and port while srun tries to connect via localhost on the
> same port. Maybe I have an ssh setting wrong somewhere? I believe I've
> tried every relevant combination in ssh_config and sshd_config. There
> are no issues with /home either; it's a shared filesystem that each node
> mounts, and we even tried no_root_squash so root can write to the
> .Xauthority file, as some folks have suggested.
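>
> For reference, the X11-related sshd_config knobs I've been toggling on
> the nodes (a sketch of the settings involved, not a confirmed fix) are:
>
> # /etc/ssh/sshd_config on the compute node
> X11Forwarding yes      # permit X11 forwarding at all
> X11UseLocalhost yes    # bind forwarded displays to loopback ("no" binds the wildcard address)
> X11DisplayOffset 10    # first display number sshd will hand out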
>
> Also, xauth list shows that no magic cookie was written for host
> cn100:
>
> [cbc at wind ~ ]$ xauth list
> wind.hpc.nau.edu/unix:14 MIT-MAGIC-COOKIE-1
> ac4a0f1bfe9589806f81dd45306ee33d
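>
> As a manual test (sketch; the cookie value below is a placeholder, and
> the display number 60 comes from the slurmd log above), I can check
> that a cookie for cn100 can be written to ~/.Xauthority at all:
>
> [cbc at wind ~ ]$ xauth add cn100:60 MIT-MAGIC-COOKIE-1 0123456789abcdef0123456789abcdef
> [cbc at wind ~ ]$ xauth list    # the cn100:60 entry should now appear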
>
> Is something preventing root from writing the magic cookie? The file is
> definitely writable:
>
> [root at cn100 ~]# touch /home/cbc/.Xauthority
> [root at cn100 ~]#
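>
> A quick check from the login node (sketch; same reservation as above)
> would be to dump the xauth state from inside the allocation, to see
> whether the cookie ever lands on the node:
>
> [cbc at wind ~ ]$ srun --x11 --reservation=root_58 sh -c 'echo $DISPLAY; xauth list'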
>
> Anyone have any ideas? Thanks!
>
> Best,
> Chris
>
> —
> Christopher Coffey
> High-Performance Computing
> Northern Arizona University
> 928-523-1167
>
--
Hadrian Djohari
Manager of Research Computing Services, [U]Tech
Case Western Reserve University
(W): 216-368-0395
(M): 216-798-7490