<html style="direction: ltr;">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
</head>
<body bidimailui-charset-is-forced="true" style="direction: ltr;"
text="#000000" bgcolor="#FFFFFF">
Hi.<br>
<div class="moz-cite-prefix">On 12/03/2019 22:53:36, Riccardo
Veraldi wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Hello,</div>
<div>After trying hard for over 10 days I am forced to
write to the list.</div>
<div>I am not able to have SLURM work with openmpi.
Openmpi compiled binaries won't run on slurm, while
all non openmpi progs run just fine under "srun". I
am using SLURM 18.08.5 building the rpm from the
tarball: rpmbuild -ta slurm-18.08.5-2.tar.bz2<br>
</div>
<div>Prior to building SLURM I installed openmpi 4.0.0,
which has built-in pmix support. The pmix libraries
are in /usr/lib64/pmix/, which is the default
installation path.</div>
<div><br>
</div>
<div>The problem is that hellompi is not working if I
launch it from srun. Of course it runs outside
slurm.</div>
<div><br>
</div>
<div>[psanagpu105:10995] OPAL ERROR: Not initialized
in file pmix3x_client.c at line 113<br>
--------------------------------------------------------------------------<br>
The application appears to have been direct launched
using "srun",<br>
but OMPI was not built with SLURM's PMI support and
therefore cannot<br>
execute. There are several options for building PMI
support under<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<p>I would guess (the config.log files would confirm it)
that you should rebuild Slurm --with-pmix and then
rebuild OpenMPI --with-slurm.</p>
<p>Currently there might be a bug in Slurm's configure script when
PMIx support is requested without a path, so you could either modify
the spec before building (add --with-pmix=/usr to the configure section)
or, for testing purposes, run ./configure --with-pmix=/usr; make; make
install.<br>
</p>
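<p>A minimal sketch of the test build described above, to be run from the
unpacked Slurm source tree. The configure option and the /usr prefix come
from this message; the run helper and the DRY_RUN and PMIX_PREFIX variables
are names made up for this illustration. By default it only prints the
commands.</p>
<pre>
```shell
#!/bin/sh
# Sketch only: PMIX_PREFIX, DRY_RUN and run() are hypothetical helper names.
PMIX_PREFIX="${PMIX_PREFIX:-/usr}"   # prefix suggested above: --with-pmix=/usr
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build_slurm() {
    # Stop at the first failing step.
    run ./configure --with-pmix="$PMIX_PREFIX" || return 1
    run make || return 1
    run make install || return 1
}

build_slurm
```
</pre>
<p>Setting DRY_RUN=0 in the environment executes the three steps instead of
only printing them.</p>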
<p><br>
</p>
<p>It seems your current configuration has a built-in mismatch: Slurm
only supports PMI2, while OpenMPI only supports PMIx. You should
build with at least one common PMI: either external PMIx when
building Slurm, or Slurm's PMI2 when building OpenMPI.</p>
<p>However, I would have expected the non-PMI option (srun
--mpi=openmpi) to work even in your environment, and Slurm should have
built PMIx support automatically since it's in the default search
path.<br>
</p>
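<p>One way to check for the mismatch is to look for a pmix entry in the
srun --mpi=list output. The sketch below assumes the "srun: NAME" line
format shown in the quoted list; has_mpi_plugin is a hypothetical helper
name, and the sample text is copied from this thread rather than read from
a live system.</p>
<pre>
```shell
#!/bin/sh
# has_mpi_plugin TEXT NAME (hypothetical helper)
# Succeeds if TEXT contains a line starting with "srun: NAME".
# This is a prefix match, so NAME=pmix would also match e.g. a pmix_v2 entry.
has_mpi_plugin() {
    printf '%s\n' "$1" | grep -q "^srun: $2"
}

# Sample copied from the list quoted in this thread.
sample='srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2'

if has_mpi_plugin "$sample" pmix; then
    echo "pmix plugin available"
else
    echo "pmix plugin missing: rebuild Slurm with PMIx support"
fi
```
</pre>
<p>On a live system the first argument would be the captured output of
srun --mpi=list instead of the sample text.</p>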
<p><br>
</p>
<blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>SLURM, depending upon the SLURM version you are
using:<br>
<br>
version 16.05 or later: you can use SLURM's PMIx
support. This<br>
requires that you configure and build SLURM
--with-pmix.<br>
<br>
Versions earlier than 16.05: you must use either
SLURM's PMI-1 or<br>
PMI-2 support. SLURM builds PMI-1 by default, or
you can manually<br>
install PMI-2. You must then build Open MPI using
--with-pmi pointing<br>
to the SLURM PMI library location.<br>
<br>
Please configure as appropriate and try again.<br>
--------------------------------------------------------------------------<br>
*** An error occurred in MPI_Init<br>
*** on a NULL communicator<br>
*** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,<br>
*** and potentially your MPI job)<br>
[psanagpu105:10995] Local abort before MPI_INIT
completed completed successfully, but am not able to
aggregate error messages, and not able to guarantee
that all other processes were killed!<br>
srun: error: psanagpu105: task 0: Exited with exit
code 1<br>
</div>
<div><br>
</div>
<div>I really have no clue. I even reinstalled openmpi
on a specific different path /opt/openmpi/4.0.0</div>
<div>Anyway, it seems like slurm does not know how to find
the MPI libraries even though they are there and
right now in the default path /usr/lib64</div>
<div><br>
</div>
<div>Even using --mpi=pmi2 or --mpi=openmpi does not
fix the problem, and I get the same error
message.</div>
<div>srun --mpi=list<br>
srun: MPI types are...<br>
srun: none<br>
srun: openmpi<br>
srun: pmi2<br>
<br>
</div>
<div><br>
</div>
<div>Any hint on how I could fix this problem?</div>
<div>Thanks a lot</div>
<div><br>
</div>
<div>Rick</div>
<div><br>
</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards,
Dani_L.</pre>
</body>
</html>