[slurm-users] Srun not setting DISPLAY with --x11 for one account

Jeffrey T Frey frey at udel.edu
Mon Jan 27 14:38:35 UTC 2020


The Slurm-native X11 plugin demands you use ~/.ssh/id_rsa{,.pub} keys.  It's hard-coded into the plugin:


/*
 * Ideally these would be selected at run time. Unfortunately,
 * only ssh-rsa and ssh-dss are supported by libssh2 at this time,
 * and ssh-dss is deprecated.
 */
static char *hostkey_priv = "/etc/ssh/ssh_host_rsa_key";
static char *hostkey_pub = "/etc/ssh/ssh_host_rsa_key.pub";
static char *priv_format = "%s/.ssh/id_rsa";
static char *pub_format = "%s/.ssh/id_rsa.pub";
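
To check whether a given account has the files those format strings point at, something like the following could be used. This is only a minimal sketch: the function name check_x11_keys and its optional home-directory argument are made up for illustration and are not part of Slurm. It only tests that the two files exist, not that the key is usable.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): report whether the key pair
# named by priv_format and pub_format exists under a home directory.
# Usage: check_x11_keys [home_dir]   (defaults to $HOME)
check_x11_keys() {
    home_dir="${1:-$HOME}"
    status=0
    # Same path patterns as the plugin's priv_format and pub_format.
    for fmt in '%s/.ssh/id_rsa' '%s/.ssh/id_rsa.pub'; do
        path=$(printf "$fmt" "$home_dir")
        if [ -f "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    # Returns 0 only when both files are present.
    return "$status"
}
```

Running check_x11_keys as the affected user prints one line per file and returns non-zero if either is absent. If they are missing, creating a pair with ssh-keygen -t rsa -f ~/.ssh/id_rsa and adding the public key to authorized_keys, as the older setup described below did, would presumably be enough, though I have not verified that on a Warewulf system.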

> On Jan 27, 2020, at 09:34 , Simon Andrews <simon.andrews at babraham.ac.uk> wrote:
> 
> I’ve managed to track down the difference between the accounts which work and those which don’t – but I still don’t understand the mechanism.
>  
> The accounts which work all had their home directories used on an older system.  The ones which fail were only ever used on the new system.  The relevant difference seems to be the way their ssh keys are set up.  On the old system a standard ssh-keygen was run, creating ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub files and putting the pub file into authorized_keys.
>  
> On the new Warewulf-based system ssh-keygen was again run, but the default key file names were changed.  We now have ~/.ssh/cluster and ~/.ssh/cluster.pub and there is a ~/.ssh/config file which contains:
>  
> # Added by Warewulf  2019-12-10
> Host pebble*
>    IdentityFile ~/.ssh/cluster
>    StrictHostKeyChecking=no
>  
> This all works, and I can ssh from the head node to the ‘pebble’ compute nodes without problems.  However, something in the code for the Slurm X11 forwarder is specifically looking for id_rsa files (or is ignoring the config file), since the forwarding fails if I don’t have these and works as soon as I do.
>  
> Any ideas where this might be happening, so I can either file a bug or change whatever setting this needs?
>  
> Simon.
>  
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of William Brown
> Sent: 24 January 2020 17:21
> To: Slurm User Community List <slurm-users at lists.schedmd.com>
> Subject: Re: [slurm-users] Srun not setting DISPLAY with --x11 for one account
>  
> There are differences for X11 between Slurm versions so it may help to know which version you have.
>  
> I tried some of your commands on our Slurm 19.05.3-2 cluster and, interestingly, in the session on the compute node I don't see the cookie for the login node.  This was with MobaXterm:
>  
> [user at prdubrvm005 ~]$ xauth list
> prdubrvm005.research.rcsi.com/unix:10  MIT-MAGIC-COOKIE-1  2efc5dd851736e3848193f65d038eca8
> [user at prdubrvm005 ~]$ srun --pty  --x11  --preserve-env /bin/bash
> [user at prdubrhpc1-02 ~]$ xauth list
> prdubrhpc1-02.research.rcsi.com/unix:95  MIT-MAGIC-COOKIE-1  2efc5dd851736e3848193f65d038eca8
> [user at prdubrhpc1-02 ~]$ echo $DISPLAY
> localhost:95.0
>  
> Any per-user problem would make me suspect the user having a different shell, or something in their login script.  Can you make their .bashrc and .bash_profile just exit?  Or look for hidden configuration files for <something> in their home directory?
>  
> William
>  
>  
>  
> On Fri, 24 Jan 2020 at 16:05, Simon Andrews <simon.andrews at babraham.ac.uk> wrote:
> I have a weird problem which I can’t get to the bottom of. 
>  
> We have a cluster which allows users to start interactive sessions which forward any X11 sessions they generated on the head node.  This generally works fine, but on the account of one user it doesn’t work.  The X11 connection to the head node is fine, but it won’t transfer to the compute node.
>  
> The symptoms are shown below:
>  
> A good user gets this:
>  
> [good at headnode ~]$ xauth list
> headnode.babraham.ac.uk/unix:12  MIT-MAGIC-COOKIE-1  f04a2bf9a921a3357e44373655add14a
>  
> [good at headnode ~]$ echo $DISPLAY
> localhost:12.0
>  
> [good at headnode ~]$ srun --pty -p interactive --x11  --preserve-env /bin/bash
>  
> [good at compute ~]$ xauth list
> headnode.babraham.ac.uk/unix:12  MIT-MAGIC-COOKIE-1  f04a2bf9a921a3357e44373655add14a
> compute/unix:25  MIT-MAGIC-COOKIE-1  f04a2bf9a921a3357e44373655add14a
>  
> [good at compute ~]$ echo $DISPLAY
> localhost:25.0
>  
> So the cookie is copied from the head node and forwarded and the DISPLAY variable is updated.
>  
> The bad user gets this:
>  
> [bad at headnode ~]$ xauth list
> headnode.babraham.ac.uk/unix:10  MIT-MAGIC-COOKIE-1  c39a493a37132d308b37469d363d8692
>  
> [bad at headnode ~]$ echo $DISPLAY
> localhost:10.0
>  
> [bad at headnode ~]$ srun --pty -p interactive --x11  --preserve-env /bin/bash
>  
> [bad at compute ~]$ xauth list
> headnode.babraham.ac.uk/unix:10  MIT-MAGIC-COOKIE-1  c39a493a37132d308b37469d363d8692
>  
> [bad at compute ~]$ echo $DISPLAY
> localhost:10.0
>  
> So the cookie isn’t copied and the DISPLAY isn’t updated.  I can’t see any errors in the logs and I can’t see anything different about this account.
>  
> If I do a straightforward ssh -Y from the head node to a compute node from the bad account then that works fine – it’s only something specific to the way that srun forwards X which fails.
>  
> Any ideas or suggestions for debugging would be appreciated as I’m running out of things to try!
>  
> Simon.
> The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered Charity No. 1053902.
> The information transmitted in this email is directed only to the addressee. If you received this in error, please contact the sender and delete this email from your system. The contents of this e-mail are the views of the sender and do not necessarily represent the views of the Babraham Institute. Full conditions at: www.babraham.ac.uk
