<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">thanks to all.</div>
<div class="moz-cite-prefix">the problem is that slurm's configure
is not able to find the pmix includes</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">configure:20846: checking for pmix
installation<br>
configure:21005: result: <br>
configure:21021: WARNING: unable to locate pmix installation</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">regardless of the path I give.</div>
<div class="moz-cite-prefix">and the reason is that configure
searches for the following includes:</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">test -f "$d/include/pmix/pmix_common.h"</div>
<div class="moz-cite-prefix">test -f "$d/include/pmix_server.h"</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">but neither of the two is installed by
openmpi.</div>
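<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">a quick way to see which prefixes contain
the two headers is a small shell loop. this is only a sketch: the
function name check_pmix_headers is made up for illustration, the
prefixes in the last line are guesses based on the paths mentioned in
this thread, and it reports each header separately because I have not
checked how configure combines the two tests.</div>
<pre>
```shell
# Sketch: report, per prefix, whether each header that configure tests for exists.
# check_pmix_headers is a hypothetical helper; the prefixes passed in are assumptions.
check_pmix_headers() {
    for d in "$@"; do
        for h in include/pmix/pmix_common.h include/pmix_server.h; do
            if test -f "$d/$h"; then
                echo "found $d/$h"
            else
                echo "missing $d/$h"
            fi
        done
    done
}

# Example call; replace the prefixes with the ones you passed to --with-pmix.
check_pmix_headers /usr /usr/lib64/pmix /opt/openmpi/4.0.0
```
</pre>
<div class="moz-cite-prefix">any prefix that only prints "missing"
lines cannot satisfy those two tests, whatever path is given to
configure.</div>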
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">one of the two is in the openmpi source
code tarball</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">./opal/mca/pmix/pmix3x/pmix/include/pmix_server.h<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">the other one exists only as a ".h.in"
template, not as a ".h" file<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">./opal/mca/pmix/pmix3x/pmix/include/pmix_common.h.in<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">anyway they do not get installed by the
rpm.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">the last thing I can try is to build
openmpi directly from source and give up on the rpm package
build. The openmpi .spec also has errors, which I had to fix
manually to get it to build successfully.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 3/12/19 4:56 PM, Daniel Letai wrote:<br>
</div>
<blockquote type="cite"
cite="mid:3ba75426-79d9-9715-765b-1557fcae9f8b@letai.org.il">
<style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
Hi.<br>
<div class="moz-cite-prefix">On 12/03/2019 22:53:36, Riccardo
Veraldi wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Hello,</div>
<div>after trying hard for over 10 days I am forced
to write to the list.</div>
<div>I am not able to have SLURM work with openmpi.
Openmpi compiled binaries won't run on slurm,
while all non openmpi progs run just fine under
"srun". I am using SLURM 18.08.5 building the rpm
from the tarball: rpmbuild -ta
slurm-18.08.5-2.tar.bz2<br>
</div>
<div>prior to building SLURM I installed openmpi 4.0.0
which has built in pmix support. the pmix
libraries are in /usr/lib64/pmix/ which is the
default installation path.</div>
<div><br>
</div>
<div>The problem is that hellompi is not working if
I launch it from srun. of course it runs outside
slurm.</div>
<div><br>
</div>
<div>[psanagpu105:10995] OPAL ERROR: Not initialized
in file pmix3x_client.c at line 113<br>
--------------------------------------------------------------------------<br>
The application appears to have been direct
launched using "srun",<br>
but OMPI was not built with SLURM's PMI support
and therefore cannot<br>
execute. There are several options for building
PMI support under<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<p>I would guess (though the config.log files would be needed to verify
it) that you should rebuild Slurm --with-pmix and then
rebuild OpenMPI --with-slurm.</p>
<p>Currently there might be a bug in Slurm's configure script when
building PMIx support without an explicit path, so you might either
modify the spec before building (add --with-pmix=/usr to the configure
section) or, for testing purposes, run ./configure --with-pmix=/usr;
make; make install.<br>
</p>
<p><br>
</p>
<p>It seems your current configuration has a built-in mismatch:
Slurm only supports pmi2, while OpenMPI only supports PMIx. You
should build with at least one common PMI: either external PMIx
when building Slurm, or Slurm's PMI2 when building OpenMPI.</p>
<p>However, I would have expected the non-PMI option (srun
--mpi=openmpi) to work even in your env, and Slurm should have
built PMIx support automatically since it's in the default search
path.<br>
</p>
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>SLURM, depending upon the SLURM version you are
using:<br>
<br>
version 16.05 or later: you can use SLURM's PMIx
support. This<br>
requires that you configure and build SLURM
--with-pmix.<br>
<br>
Versions earlier than 16.05: you must use either
SLURM's PMI-1 or<br>
PMI-2 support. SLURM builds PMI-1 by default, or
you can manually<br>
install PMI-2. You must then build Open MPI
using --with-pmi pointing<br>
to the SLURM PMI library location.<br>
<br>
Please configure as appropriate and try again.<br>
--------------------------------------------------------------------------<br>
*** An error occurred in MPI_Init<br>
*** on a NULL communicator<br>
*** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,<br>
*** and potentially your MPI job)<br>
[psanagpu105:10995] Local abort before MPI_INIT
completed completed successfully, but am not able
to aggregate error messages, and not able to
guarantee that all other processes were killed!<br>
srun: error: psanagpu105: task 0: Exited with exit
code 1<br>
</div>
<div><br>
</div>
<div>I really have no clue. I even reinstalled
openmpi on a specific different path
/opt/openmpi/4.0.0</div>
<div>anyway seems like slurm does not know how to
find the MPI libraries even though they are there
and right now in the default path /usr/lib64</div>
<div><br>
</div>
<div>even using --mpi=pmi2 or --mpi=openmpi does not
fix the problem and the same error message is
given to me.</div>
<div>srun --mpi=list<br>
srun: MPI types are...<br>
srun: none<br>
srun: openmpi<br>
srun: pmi2<br>
<br>
</div>
<div><br>
</div>
<div>Any hint on how I could fix this problem?</div>
<div>thanks a lot</div>
<div><br>
</div>
<div>Rick</div>
<div><br>
</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards,
Dani_L.</pre>
</blockquote>
<p><br>
</p>
</body>
</html>