<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Phil,</p>
<p>From a distance, it feels like there may be a mismatch in Slurm
versions (an auxiliary build hiding out somewhere?). You might try
something like</p>
<p>$ which srun; srun which srun</p>
<p>Just to confirm that both the submit and execute nodes are
running the same slurm instance.</p>
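<p>If it helps, the same check can be scripted so that the two answers
are compared for you. This is only a rough sketch: the helper name
<code>compare_values</code> and the variable names are made up for
illustration, and it assumes a single-node job step (<code>-N1 -n1</code>)
is allowed on your cluster.</p>

```shell
#!/bin/sh
# Hypothetical helper: report whether two strings (paths or versions) agree.
# Prints the result and returns 0 on a match, 1 on a mismatch.
compare_values() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: local=$1 remote=$2"
        return 1
    fi
}

# Only talk to Slurm if srun is actually on PATH.
if command -v srun >/dev/null 2>&1; then
    local_path=$(command -v srun)
    # Run the same lookup once, on one allocated compute node.
    remote_path=$(srun -N1 -n1 sh -c 'command -v srun')
    compare_values "$local_path" "$remote_path"
    # Same idea for the reported version string.
    compare_values "$(srun --version)" "$(srun -N1 -n1 srun --version)"
fi
```

<p>A mismatch in either line would point at a second Slurm build on the
compute side.</p>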
<p>Andy<br>
</p>
<div class="moz-cite-prefix">On 12/7/2020 9:19 AM, Yuengling, Philip
J. wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5C7D2804-9162-4FE5-936C-58FD376D3001@jhuapl.edu">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:12.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}</style>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">Thanks Andy,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Slurm was
compiled with </span>
<span style="font-size:11.0pt">--with-pmix=/share/local/pmix-3.2.1.
PMIx is</span><span style="font-size:11.0pt">
installed under /share/local/pmix-3.2.1, which is an NFS
share mounted on all the nodes. I should also note that I used
devtoolset-10 (GCC 10) on RHEL 7 and confirmed that
everything was compiled with that version of the compiler.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">I also set
LD_LIBRARY_PATH to include /share/local/pmix-3.2.1<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Cheers!<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Phil<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="color:black">From: </span></b><span
style="color:black">slurm-users
<a class="moz-txt-link-rfc2396E" href="mailto:slurm-users-bounces@lists.schedmd.com">&lt;slurm-users-bounces@lists.schedmd.com&gt;</a> on behalf of
Andy Riebs <a class="moz-txt-link-rfc2396E" href="mailto:andy@candooz.com">&lt;andy@candooz.com&gt;</a><br>
<b>Reply-To: </b><a class="moz-txt-link-rfc2396E" href="mailto:andy@candooz.com">"andy@candooz.com"</a>
<a class="moz-txt-link-rfc2396E" href="mailto:andy@candooz.com">&lt;andy@candooz.com&gt;</a>, Slurm User Community List
<a class="moz-txt-link-rfc2396E" href="mailto:slurm-users@lists.schedmd.com">&lt;slurm-users@lists.schedmd.com&gt;</a><br>
<b>Date: </b>Friday, December 4, 2020 at 3:07 PM<br>
<b>To: </b><a class="moz-txt-link-rfc2396E" href="mailto:slurm-users@lists.schedmd.com">"slurm-users@lists.schedmd.com"</a>
<a class="moz-txt-link-rfc2396E" href="mailto:slurm-users@lists.schedmd.com">&lt;slurm-users@lists.schedmd.com&gt;</a><br>
<b>Subject: </b>[EXT] Re: [slurm-users] pmix issue<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
</div>
<p>Also, Slurm was built with "/fs/local/pmix-3.2.1". Does that path
resolve to the same place as "/share/local/pmix-3.2.1"?<o:p></o:p></p>
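<p>One way to answer that on a given node is sketched below. The helper
name <code>same_dir</code> is made up, the <code>-ef</code> test is a
bash/dash feature rather than strict POSIX, and the plugin path
<code>/usr/lib64/slurm/mpi_pmix_v3.so</code> is an assumption based on the
<code>--libdir=/usr/lib64</code> in the config.log, so adjust it if your
plugins live elsewhere.</p>

```shell
#!/bin/sh
# Hypothetical helper: are two paths the same directory?
# Returns 0 if same, 1 if different, 2 if either one is missing.
same_dir() {
    if [ ! -d "$1" ] || [ ! -d "$2" ]; then
        return 2
    fi
    # -ef is true when both names refer to the same device and inode.
    [ "$1" -ef "$2" ]
}

build_prefix=/fs/local/pmix-3.2.1       # path given to Slurm's configure
install_prefix=/share/local/pmix-3.2.1  # path PMIx was installed under

rc=0
same_dir "$build_prefix" "$install_prefix" || rc=$?
case $rc in
    0) echo "$(uname -n): both paths are the same directory" ;;
    1) echo "$(uname -n): the paths are different directories" ;;
    *) echo "$(uname -n): at least one of the paths does not exist" ;;
esac

# Assumed plugin location; show which libpmix (if any) it resolves to.
plugin=/usr/lib64/slurm/mpi_pmix_v3.so
if [ -f "$plugin" ]; then
    ldd "$plugin" | grep -i pmix || echo "no libpmix dependency listed"
fi
```

<p>Running it on both the submit host and a compute node should show
whether the two see the same thing.</p>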
<p>Andy<o:p></o:p></p>
<div>
<p class="MsoNormal">On 12/4/2020 2:59 PM, Andy Riebs wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p>Are you sure that /share/local/pmix-3.2.1 exists on the
compute nodes?<o:p></o:p></p>
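<p>A quick way to check might look like the following. It is a sketch
only: <code>report_dir</code> is a made-up helper name, and the node count
of 5 simply mirrors the <code>-N5</code> used in the failing command.</p>

```shell
#!/bin/sh
# Hypothetical helper: print "<hostname> present <dir>" or
# "<hostname> MISSING <dir>" for one directory on the current host.
report_dir() {
    if [ -d "$1" ]; then
        echo "$(uname -n) present $1"
    else
        echo "$(uname -n) MISSING $1"
    fi
}

pmix_prefix=/share/local/pmix-3.2.1

# Submit host first.
report_dir "$pmix_prefix"

# Then the same test on 5 compute nodes, one task per node.
# The directory is passed as $0 to the inline script.
if command -v srun >/dev/null 2>&1; then
    srun -N5 --ntasks-per-node=1 sh -c \
        'if [ -d "$0" ]; then echo "$(uname -n) present $0"; else echo "$(uname -n) MISSING $0"; fi' \
        "$pmix_prefix"
fi
```

<p>Any line reporting MISSING would identify a node where the NFS share
is not mounted.</p>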
<div>
<p class="MsoNormal">On 12/4/2020 2:54 PM, Yuengling, Philip
J. wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">Hi
everyone,</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">I’ve
been having difficulty getting the --mpi=pmix_v3
option to work for me. I can get --mpi=pmi2 to work
ok, but I really want to understand what I’m doing
wrong here. Everything seems to build ok.</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">$ srun
--mpi=list</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
MPI types are...</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
pmix</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
pmix_v3</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
cray_shasta</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
none</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
pmi2</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">$ srun
--mpi=pmix_v3 -N5 date</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
error: task 1 launch failed: Invalid MPI plugin name</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
error: task 2 launch failed: Invalid MPI plugin name</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
error: task 3 launch failed: Invalid MPI plugin name</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
error: task 4 launch failed: Invalid MPI plugin name</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">srun:
error: task 0 launch failed: Invalid MPI plugin name</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">$ srun
--mpi=pmi2 -N5 date</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Fri
Dec 4 13:52:39 EST 2020</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Fri
Dec 4 13:52:39 EST 2020</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Fri
Dec 4 13:52:39 EST 2020</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Fri
Dec 4 13:52:39 EST 2020</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Fri
Dec 4 13:52:39 EST 2020</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">openpmix:</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">CC=/opt/rh/devtoolset-10/root/usr/bin/gcc
./configure --prefix=/share/local/pmix-3.2.1
--with-hwloc=/share/local/hwloc-2.4.0</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Slurm
20.11.0:</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">rpmbuild
--define "_with_pmix --with-pmix=/fs/local/pmix-3.2.1"
-ta slurm-20.11.0.tar.bz2</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">From
config.log:</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">./configure
--build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr
--exec-prefix=/usr --bindir=/usr/bin
--sbindir=/usr/sbin --sysconfdir=/etc/slurm
--datadir=/usr/share --includedir=/usr/include
--libdir=/usr/lib64 --libexecdir=/usr/libexec
--localstatedir=/var --sharedstatedir=/var/lib
--mandir=/usr/share/man --infodir=/usr/share/info
--with-pmix=/fs/local/pmix-3.2.1 --disable-slurmrestd</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Open
MPI 4.0.5: </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt">./configure
'--prefix=/share/openmpi-4.0.5' '--with-cuda'
'--with-pmix=/share/local/pmix-3.2.1'
'--with-pmi=/usr' '--with-slurm' '--without-ucx'
'--without-verbs'</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:5.0pt;color:white">-- </span><o:p></o:p></p>
<div>
<p class="MsoNormal"><span
style="font-size:10.0pt;color:black"> </span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;color:black">Philip J.
Yuengling</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;color:black">Johns Hopkins
University</span><o:p></o:p></p>
</div>
</div>
</blockquote>
</blockquote>
</div>
</blockquote>
</body>
</html>