<div dir="auto">Hey Luke,<div dir="auto"><br></div><div dir="auto">Thanks for the response. I should have mentioned that I'm on Debian. What's the name of the Ubuntu package for PMIx? I'll see if I can track down the Debian equivalent. </div><div dir="auto"><br></div><div dir="auto">When you build Slurm from scratch, you have to place the .service files into /etc/init.d and the daemon files in /etc/systemd/system, right? When I tried building from source, it didn't do that for me (even as root). I'm not sure whether that's intended or whether I missed a step. </div><div dir="auto"><br></div><div dir="auto">Thanks</div><div dir="auto">-ave</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 10, 2020, 1:11 PM Luke Yeager &lt;<a href="mailto:lyeager@nvidia.com">lyeager@nvidia.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="m_-3057965264147123133WordSection1">
<p class="MsoNormal">Hi Avery,<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<ul style="margin-top:0in" type="disc">
<li class="m_-3057965264147123133MsoListParagraph" style="margin-left:0in">pmix: we just use the standard Ubuntu packages on 20.04. Unfortunately the standard packages on 18.04 are too out of date for us.<u></u><u></u></li><li class="m_-3057965264147123133MsoListParagraph" style="margin-left:0in">openmpi: we build our own, using ./configure --with-pmix=internal …<u></u><u></u></li><li class="m_-3057965264147123133MsoListParagraph" style="margin-left:0in">slurm: we build our own, using ./configure --with-pmix=PATH … (<a href="https://github.com/NVIDIA/nephele-packages/blob/42145aef4bbe2cff335a1fca222766232dab7aa7/slurm/debian/rules#L41" target="_blank" rel="noreferrer">see
here</a>)<u></u><u></u></li></ul>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Then we can set MpiDefault=pmix (<a href="https://github.com/NVIDIA/nephele/blob/1d79977164d5ef1418466bfb322d59d502c18e8f/ansible/roles/slurm/templates/etc/slurm/slurm.conf.default#L87" target="_blank" rel="noreferrer">see here</a>) and it works.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="font-family:Consolas">$ srun --mpi=list<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: MPI types are...<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: cray_shasta<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: pmi2<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: pmix_v3<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: pmix<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:Consolas">srun: none<u></u><u></u></span></p>
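<p class="MsoNormal">To check for a particular plugin from a script, something like the following could work. It is only a sketch: has_mpi_type is a hypothetical helper, not part of Slurm, and it assumes the list has the "srun: name" layout shown above (other Slurm versions may print it differently).</p>
<pre>
```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeeds when the plugin name
# given as $1 appears in "srun --mpi=list"-style text read from stdin.
has_mpi_type() {
    # Strip the assumed "srun: " prefix, then require an exact whole-line match
    # so that "pmix" does not match "pmix_v3".
    sed 's/^srun: //' | grep -Fqx -- "$1"
}

# Sample input copied from the output above.
sample='srun: MPI types are...
srun: cray_shasta
srun: pmi2
srun: pmix_v3
srun: pmix
srun: none'

if printf '%s\n' "$sample" | has_mpi_type pmix_v3; then
    echo "pmix_v3 is available"
else
    echo "pmix_v3 is missing"
fi
```
</pre>
<p class="MsoNormal">On a real node the input would come from srun itself, for example "srun --mpi=list 2&gt;&amp;1 | has_mpi_type pmix"; the 2&gt;&amp;1 is there on the assumption that srun may print the list on stderr.</p>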
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Hope that helps,<u></u><u></u></p>
<p class="MsoNormal">Luke<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> slurm-users &lt;<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank" rel="noreferrer">slurm-users-bounces@lists.schedmd.com</a>&gt;
<b>On Behalf Of </b>Avery Grieve<br>
<b>Sent:</b> Thursday, December 10, 2020 7:52 AM<br>
<b>To:</b> <a href="mailto:slurm-users@lists.schedmd.com" target="_blank" rel="noreferrer">slurm-users@lists.schedmd.com</a><br>
<b>Subject:</b> [slurm-users] slurm-wlm package OpenMPI PMIx implementation<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<div>
<p class="MsoNormal">Hi Forum,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I've been putting together an ARM cluster for fun/learning and I've been a bit lost about how to get OpenMPI and slurm to behave together.
<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I have installed the <a href="https://packages.debian.org/buster/slurm-wlm" target="_blank" rel="noreferrer">
slurm-wlm package </a>from the Debian apt repositories and compiled OpenMPI from source on my compute nodes. OpenMPI was configured with the --with-slurm option, and the configure log indicates that it has PMIx v3 built in. I thought that would be enough for
Slurm to launch a job with "srun -n 4 -N1 executable" (with MpiDefault=pmix_v3 set in slurm.conf).
</p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Unfortunately, that is not the case: I guess Slurm has no idea what pmix_v3 means unless it was compiled against PMIx. I have also tried compiling OpenMPI from source with the --with-pmi option, but the slurm-wlm package doesn't install
any of the libraries or headers (pmi.h, pmi2.h, pmix.h, etc.). Neither do any of the slurm-llnl development packages, so I'm at a loss as to what to do here.
</p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">A few notes: OpenMPI is working across my compute nodes. I can ssh to a compute node and manually start a job with mpirun that runs successfully across the nodes. My slurmctld and slurmd daemons work for single-threaded resource
allocation (and presumably OpenMP multithreading, though I haven't tested this).</p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Short of compiling Slurm from source (assuming this installs the PMI headers that I can use to build OpenMPI), which I have tried with no luck on my devices, is there a way to get Slurm and OpenMPI to behave together using the precompiled
slurm-wlm package?</p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Thank you,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal">~Avery Grieve<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">They/Them/Theirs please!<u></u><u></u></p>
</div>
<div>
<div>
<p class="MsoNormal">University of Michigan<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div>