[slurm-users] Question about PMIX ERROR messages being emitted by some child of srun process

Tommi Tervo tommi.tervo at csc.fi
Mon May 22 07:16:14 UTC 2023


> So I’m testing the use of Open MPI 5.0.0 pre-release with the Slurm/PMIx setup
> currently on NERSC Perlmutter system.
> 
> The SLURM version on Perlmutter is currently 2023.02.2
> 
> The PMIx version that the admins used to build slurm against is pmix-4.2.3.
> I’ve attached the output of  pmix_info.
> 
> My test application “works” but if I use srun, I get these types of messages:
> 
> srun -n 2 -N 2 --mpi=pmix ./ring_c
> 
> [cn316:2770176] PMIX ERROR: OUT-OF-RESOURCE in file base/bfrop_base_unpack.c at
> line 750

Hi,

Slurm 23.02.2 contains a PMIx permissions regression; it may be worth checking whether that is the case here.

https://bugs.schedmd.com/show_bug.cgi?id=16687

commit 1f9386909230cd73506d88f02f75126924d3f41e
Author: Danny Auble <da at schedmd.com>
Date:   Mon May 15 18:35:25 2023 +0200

    mpi/pmix - fix PMIx shmem backed files permissions regression.
    
    Introduced in 23.02.2 commit d23cad68df.
    
    Bug 16687
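
A quick way to confirm whether a system is running the release named in that commit message is to compare the installed version string against 23.02.2. The snippet below is only a minimal sketch: the helper name is_affected_slurm is hypothetical, and it assumes `srun --version` prints a string of the form "slurm 23.02.2". A match only shows that the regression could apply. It does not prove that the regression is behind the OUT-OF-RESOURCE message.

```shell
# Hypothetical helper (not part of Slurm): succeeds when the given version
# string is the release the commit message names as introducing the regression.
is_affected_slurm() {
    # Accept either "slurm 23.02.2" (assumed `srun --version` format)
    # or a bare "23.02.2"; strip the leading "slurm " if present.
    ver="${1#slurm }"
    [ "$ver" = "23.02.2" ]
}

# Usage on a login node (assumes srun is on PATH):
#   is_affected_slurm "$(srun --version)" && echo "23.02.2 detected, see bug 16687"
```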


BR,
Tommi

