[slurm-users] [EXTERNAL] OpenMPI and Slurm clarification?
Craig
cfreese at super.org
Mon Mar 27 19:51:16 UTC 2023
conf.log...
checking if user requested PMI support
result: no
checking if user requested internal PMIx support(yes)
result: no
checking for pmix.h in /usr
result: not found
checking for pmix.h in /usr/include
result: not found
WARNING: discovered external PMIx version is less than internal
version 3.x
WARNING: using internal PMIx
So it looks like it used the internal version (which is what I was
aiming for) and that's ok by me since it seems to be working, but if I'm
really supposed to be using the same one that SLURM used then I'm gonna
have to figure out a way to determine what that was/is.
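
For anyone following along, here is a rough sketch of how the two checks might be scripted. It is only a sketch under assumptions: the function names are made up for illustration, the plugin path in the usage comment is hypothetical (PluginDir can be read from "scontrol show config"), and ldd will only show a PMIx library if the Slurm plugin was linked against it directly rather than loading it some other way.

```shell
#!/bin/sh
# Sketch only: function names and example paths are illustrative assumptions.

# Print "internal" if an Open MPI config.log contains the
# "using internal PMIx" warning quoted above, "unknown" otherwise.
pmix_from_config_log() {
    if grep -q 'using internal PMIx' "$1"; then
        echo internal
    else
        echo unknown
    fi
}

# List any libpmix shared objects that a Slurm MPI plugin is linked against.
# Prints nothing if ldd reports no libpmix dependency for that file.
plugin_pmix_libs() {
    ldd "$1" 2>/dev/null | grep -i 'libpmix' || true
}

# Example usage (hypothetical plugin path):
#   pmix_from_config_log ./config.log
#   plugin_pmix_libs /usr/lib64/slurm/mpi_pmix_v3.so
```

If the second function prints a path, its install prefix would be the candidate value for --with-pmix=... on the Open MPI configure line.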
On 3/27/23 15:28, Pritchard Jr., Howard wrote:
>
> Hi Craig,
>
> Your use of --with-pmix on the Open MPI configure line is
> important. Without any argument to this option, Open MPI's
> configure first checks whether there is an external PMIx that is newer
> than the one included in the Open MPI release tarball. If there
> is not, the internal PMIx is built.
>
> You can check in the config.log whether the internal PMIx or an
> external one was used.
>
> If you want to be extra careful, find the location of the PMIx v3 used
> to build the SLURM PMIx plugin, and then rebuild your open mpi 4.1.5 with
>
> ./configure … --with-pmix=path_to_pmix_used_for_slurm_pmix_plugin_build ….
>
> But you may be okay without doing this. You can check this by running
> your open mpi job with
>
> srun --mpi=pmix_v3 -N2 foo
>
> and see if it behaves as expected.
>
> I’m not sure what the “openmpi” result from srun --mpi=list is about.
>
> Howard
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf
> of Craig <cfreese at super.org>
> *Reply-To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Date: *Monday, March 27, 2023 at 12:54 PM
> *To: *"slurm-users at lists.schedmd.com" <slurm-users at lists.schedmd.com>
> *Subject: *Re: [slurm-users] [EXTERNAL] OpenMPI and Slurm clarification?
>
> srun: MPI types are...
>
> srun: none
>
> srun: openmpi
>
> srun: pmix_v3
>
> srun: pmi2
>
> srun: pmix
>
> but I'm not sure that tells me much about how I am supposed to be
> building OpenMPI?
>
> On 3/27/23 14:41, Pritchard Jr., Howard wrote:
>
> HI Craig,
>
> If you run
>
> srun --mpi=list
>
> what does slurm report?
>
> That will help in determining what argument you want to supply for
> the --mpi srun option.
>
> Howard
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf
> of Craig <cfreese at super.org>
> *Reply-To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Date: *Monday, March 27, 2023 at 12:38 PM
> *To: *"slurm-users at lists.schedmd.com" <slurm-users at lists.schedmd.com>
> *Subject: *[EXTERNAL] [slurm-users] OpenMPI and Slurm clarification?
>
>
> Can someone please clarify the "best practices" for building
> OpenMPI compatible with Slurm?
>
> https://slurm.schedmd.com/mpi_guide.html#open_mpi
> tells me what I _can_ do but I'm unclear as to what I _should_ do.
>
> I've built OpenMPI 4.1.5 with: --with-pmix
> --with-libevent=internal --with-hwloc=internal --with-slurm. If
> I run an MPI program on my cluster (slurm 18.08.8) with "srun -N2
> foo" it seems to work fine. (slurm.conf has MpiDefault=pmix).
>
> If I "srun --mpi=openmpi -N2 foo" it chokes with:
>
> OPAL_ERROR: Unreachable in file
> ../../../../../opal/mca/pmix/pmix3/pmix3x_client.c at line 112
> -------------------------------------------------------------------------------------------------------------------
> This application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
>
> version 16.05 or later: you can use SLURM's PMIx support. This
> requires that you configure and build SLURM --with-pmix.
> .
> .
> .
>
>
> So I guess the question is: what is the "right" way to build
> OpenMPI with Slurm? Is the fact that my non-Slurm pmix works
> "correct", or am I just getting lucky that the various software I
> have happens to be compatible? If I build OpenMPI, am I
> supposed to use Slurm's pmix/libevent/hwloc, or is that optional?
> If it's optional, when/why might I choose to do so? If I need
> Slurm's versions, is there some way to find which
> pmix/libevent/hwloc my current Slurm install is using? Note: my
> sysadmins are not going to be helpful as they think Slurm 18 and
> OpenMPI 4.0.2a are adequate for users' needs :^(.
>
> I like the idea of _not_ tying my OpenMPI to the installed Slurm
> just in case our support people ever decide to upgrade system
> software.
>
> Thanks.
>
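
P.S. The "srun --mpi=list" output quoted above can also be checked from a script before picking a value for --mpi. This is a minimal sketch, assuming each plugin is printed on its own line with an "srun: " prefix as in the listing above; the function name is made up for illustration, and the 2>&1 in the usage comment assumes srun writes the list to stderr, which I have not verified for every Slurm version.

```shell
#!/bin/sh
# Succeed if the MPI plugin name given as $1 appears as a whole line in
# "srun --mpi=list" style output read from stdin (e.g. "srun: pmix_v3").
has_mpi_type() {
    sed 's/^srun:[[:space:]]*//' | grep -qxF "$1"
}

# Example usage on a cluster (foo is the test program from this thread):
#   srun --mpi=list 2>&1 | has_mpi_type pmix_v3 && srun --mpi=pmix_v3 -N2 foo
```

The match is on the whole line, so asking for "pmix" does not accidentally match "pmix_v3".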