<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">thanks to all.</div>
    <div class="moz-cite-prefix">the problem is that slurm's configure
      is not able to find the pmix includes</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">configure:20846: checking for pmix
      installation<br>
      configure:21005: result: <br>
      configure:21021: WARNING: unable to locate pmix installation</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">regardless of the path I give.</div>
    <div class="moz-cite-prefix">and the reason is that configure
      searches for the following includes:</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">test -f "$d/include/pmix/pmix_common.h"</div>
    <div class="moz-cite-prefix">test -f "$d/include/pmix_server.h"</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">but neither of the two is installed by
      openmpi.</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">one of the two is in the openmpi source
      code tarball:</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">./opal/mca/pmix/pmix3x/pmix/include/pmix_server.h<br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">the other one is only there as a ".h.in"
      file, not as a ".h":<br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">./opal/mca/pmix/pmix3x/pmix/include/pmix_common.h.in<br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">either way, neither header gets installed
      by the rpm.</div>
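    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">a rough sketch for checking any prefix
      against the two header paths that configure tests for.
      check_pmix_prefix is just a name I made up, the prefixes in the
      loop are guesses to be replaced with your own, and I am assuming
      both headers have to be present for the check to pass:</div>
    <pre>
```shell
#!/bin/sh
# Sketch: report whether a prefix contains the two headers that
# Slurm's configure looks for (paths copied from the test -f lines).
# check_pmix_prefix is a hypothetical helper name, not part of Slurm.
# Assumption: both headers must exist for the prefix to be usable.
check_pmix_prefix() {
    d=$1
    if [ -f "$d/include/pmix/pmix_common.h" ]; then
        if [ -f "$d/include/pmix_server.h" ]; then
            echo "$d: both headers found"
            return 0
        fi
    fi
    echo "$d: headers missing"
    return 1
}

# Candidate prefixes are assumptions; replace them with your own.
for prefix in /usr /usr/local /opt/openmpi/4.0.0; do
    check_pmix_prefix "$prefix" || true
done
```
    </pre>
    <div class="moz-cite-prefix">a prefix that reports "headers missing"
      here is unlikely to be accepted when passed to --with-pmix.</div>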
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">the last thing I can try is to build
      openmpi directly from source and give up on the rpm package
      build. The openmpi .spec also has errors, which I had to fix
      manually before it would build successfully.</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">On 3/12/19 4:56 PM, Daniel Letai wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:3ba75426-79d9-9715-765b-1557fcae9f8b@letai.org.il">
      <style type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; } </style>
      Hi.<br>
      <div class="moz-cite-prefix">On 12/03/2019 22:53:36, Riccardo
        Veraldi wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">
                    <div>Hello,</div>
                    <div>after trying hard for over 10 days I am forced
                      to write to the list.</div>
                    <div>I am not able to have SLURM work with openmpi.
                      Openmpi compiled binaries won't run on slurm,
                      while all non openmpi progs run just fine under
                      "srun". I am using SLURM 18.08.5 building the rpm
                      from the tarball: rpmbuild -ta
                      slurm-18.08.5-2.tar.bz2<br>
                    </div>
                    <div>prior to building SLURM I installed openmpi 4.0.0,
                      which has built-in pmix support. The pmix
                      libraries are in /usr/lib64/pmix/ which is the
                      default installation path.</div>
                    <div><br>
                    </div>
                    <div>The problem is that hellompi does not work if
                      I launch it from srun. Of course it runs outside
                      slurm.</div>
                    <div><br>
                    </div>
                    <div>[psanagpu105:10995] OPAL ERROR: Not initialized
                      in file pmix3x_client.c at line 113<br>
--------------------------------------------------------------------------<br>
                      The application appears to have been direct
                      launched using "srun",<br>
                      but OMPI was not built with SLURM's PMI support
                      and therefore cannot<br>
                      execute. There are several options for building
                      PMI support under<br>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </blockquote>
      <p>I would guess (but having the config.log files would verify it)
        that you should rebuild Slurm --with-pmix and then you should
        rebuild OpenMPI --with-slurm.</p>
      <p>Currently there might be a bug in Slurm's configure file
        when building PMIx support without a path, so you might either
        modify the spec before building (add --with-pmix=/usr to the
        configure section) or, for testing purposes, run ./configure
        --with-pmix=/usr; make; make install.<br>
      </p>
      <p><br>
      </p>
      <p>It seems your current configuration has a built-in mismatch:
        Slurm only supports pmi2, while OpenMPI only supports PMIx. You
        should build with at least one common PMI: either external PMIx
        when building Slurm, or Slurm's PMI2 when building OpenMPI.</p>
      <p>However, I would have expected the non-PMI option (srun
        --mpi=openmpi) to work even in your env, and Slurm should have
        built PMIx support automatically since it's in default search
        path.<br>
      </p>
      <p><br>
      </p>
      <blockquote type="cite"
cite="mid:CAFYBv87PVXp8E5nrFiyH1ntfPQycRVaAK5r-1D-eX0Q9jkDhsw@mail.gmail.com">
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <div dir="ltr">
                <div dir="ltr">
                  <div dir="ltr">
                    <div>SLURM, depending upon the SLURM version you are
                      using:<br>
                      <br>
                        version 16.05 or later: you can use SLURM's PMIx
                      support. This<br>
                        requires that you configure and build SLURM
                      --with-pmix.<br>
                      <br>
                        Versions earlier than 16.05: you must use either
                      SLURM's PMI-1 or<br>
                        PMI-2 support. SLURM builds PMI-1 by default, or
                      you can manually<br>
                        install PMI-2. You must then build Open MPI
                      using --with-pmi pointing<br>
                        to the SLURM PMI library location.<br>
                      <br>
                      Please configure as appropriate and try again.<br>
--------------------------------------------------------------------------<br>
                      *** An error occurred in MPI_Init<br>
                      *** on a NULL communicator<br>
                      *** MPI_ERRORS_ARE_FATAL (processes in this
                      communicator will now abort,<br>
                      ***    and potentially your MPI job)<br>
                      [psanagpu105:10995] Local abort before MPI_INIT
                      completed completed successfully, but am not able
                      to aggregate error messages, and not able to
                      guarantee that all other processes were killed!<br>
                      srun: error: psanagpu105: task 0: Exited with exit
                      code 1<br>
                    </div>
                    <div><br>
                    </div>
                    <div>I really have no clue. I even reinstalled
                      openmpi on a specific different path
                      /opt/openmpi/4.0.0</div>
                    <div>anyway it seems like slurm does not know how to
                      find the MPI libraries even though they are there
                      and right now in the default path /usr/lib64</div>
                    <div><br>
                    </div>
                    <div>even using --mpi=pmi2 or --mpi=openmpi does not
                      fix the problem and the same error message is
                      given to me.</div>
                    <div>srun --mpi=list<br>
                      srun: MPI types are...<br>
                      srun: none<br>
                      srun: openmpi<br>
                      srun: pmi2<br>
                      <br>
                    </div>
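                    <div>A rough sketch for checking that list in a
                      script. has_pmix_type is a made-up helper name,
                      not a Slurm command, and it assumes a PMIx-enabled
                      build lists a type whose name contains "pmix":</div>
                    <pre>
```shell
#!/bin/sh
# Sketch: read the output of "srun --mpi=list" on stdin and report
# whether any listed MPI type contains the string "pmix".
# has_pmix_type is a hypothetical helper name, not a Slurm command.
has_pmix_type() {
    if grep -q 'pmix'; then
        echo "pmix type listed"
        return 0
    fi
    echo "no pmix type listed"
    return 1
}

# Sample input copied from the list shown in this message.
printf 'srun: MPI types are...\nsrun: none\nsrun: openmpi\nsrun: pmi2\n' | has_pmix_type || true
```
                    </pre>
                    <div>On a live system, pipe the output of srun
                      --mpi=list into has_pmix_type (redirecting stderr
                      to stdout first if that is where your srun prints
                      the list).</div>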
                    <div><br>
                    </div>
                    <div>Any hint on how I could fix this problem?</div>
                    <div>thanks a lot</div>
                    <div><br>
                    </div>
                    <div>Rick</div>
                    <div><br>
                    </div>
                    <div><br>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </blockquote>
      <pre class="moz-signature" cols="72">-- 
Regards,

Dani_L.</pre>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>