[slurm-users] Issue with x11

Marcus Wagner wagner at itc.rwth-aachen.de
Thu May 16 12:15:11 UTC 2019


Hi Alan,

We are also seeing this, but it has nothing to do with X11 support,
since we currently compile SLURM without X11 support.
We also sometimes see jobs that keep running even though, e.g., MPI rank
one was killed by the OOM killer and rank zero is stuck in MPI_Finalize.
SLURM does not seem to detect every time that the OOM killer was active,
and so it does not terminate the remaining MPI processes.
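
To cross-check on a node whether the OOM killer was active, independent of
what SLURM reports, one could scan the kernel log for its messages. The
following is only a minimal sketch: the function name find_oom_kills is
made up for illustration, and the exact message wording is an assumption
that differs between kernel versions ("Kill process" on older kernels,
"Killed process" on newer ones), so the pattern may need adjusting.

```python
import re

# Assumed kernel log wording (varies by kernel version), for example:
#   Out of memory: Kill process 1234 (a.out) score 900 or sacrifice child
#   Memory cgroup out of memory: Killed process 1234 (a.out) total-vm:...
OOM_RE = re.compile(r"[Oo]ut of memory: Kill(?:ed)? process (\d+) \(([^)]*)\)")


def find_oom_kills(log_lines):
    """Return a list of (pid, command) for each OOM-killer victim found."""
    victims = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims
```

The input would be the lines of the kernel log, e.g. the output of dmesg
split into lines. If this reports a victim whose job is still running,
that would point to the case described above.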

Best Marcus

On 5/16/19 9:04 AM, Alan Orth wrote:
> Yes I'm also looking forward to SLURM 19.05. We have had lots of 
> issues with X11 since we upgraded to 18.08 and started using its 
> built-in X11 support. Part of this was resolved by setting 
> "X11Parameters=local_xauthority" in slurm.conf to reduce locking 
> contention on the Xauthority file, but now we get a handful of nodes 
> drained every day with reason "Kill task failed". In ten years of 
> using SLURM I've never had so many problems as I'm having now. :\
>
> Regards,
>
> On Wed, May 15, 2019 at 9:40 PM Christopher Samuel <chris at csamuel.org>
> wrote:
>
>     On 5/15/19 11:36 AM, Mahmood Naderan wrote:
>
>     > I really like to know why x11 is not so friendly? For example,
>     > slurm works with MPI. Why not with X11?!
>
>     Because MPI support is fundamental, X11 support is nice to have.
>
>     I suspect 19.05 will make your life an awful lot easier!
>
>     All the best,
>     Chris
>     -- 
>        Chris Samuel  : http://www.csamuel.org/ :  Berkeley, CA, USA
>
>
>
> -- 
> Alan Orth
> alan.orth at gmail.com
> https://picturingjordan.com
> https://englishbulgaria.net
> https://mjanja.ch
> "In heaven all the interesting people are missing." ―Friedrich Nietzsche

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de
