[slurm-users] PMIx and Slurm

rhc at open-mpi.org rhc at open-mpi.org
Tue Nov 28 19:55:10 MST 2017


My apologies - I guess we hadn’t been tracking it that way. I’ll try to add some clarification. We presented a nice table at the BoF and I just need to find a few minutes to post it.

I believe you do have to build slurm against PMIx so that the pmix plugin is compiled. You then also have to specify --mpi=pmix so slurm knows to use that plugin for this specific job.
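
A minimal sketch of checking that the plugin is actually there before relying on it, assuming `srun` is on the PATH. The helper name `has_pmix_plugin` is made up for illustration, and the output format of `srun --mpi=list` differs between Slurm versions, so the pattern below is only a guess at the common forms (`pmix`, `pmix_v2`, and so on).

```shell
# Hypothetical helper: reads the output of `srun --mpi=list` on stdin and
# succeeds if a pmix plugin (e.g. "pmix" or "pmix_v2") is listed.
has_pmix_plugin() {
    grep -Eq '(^|[[:space:]])pmix(_v[0-9]+)?[[:space:]]*$'
}

if command -v srun >/dev/null 2>&1; then
    # Some Slurm versions print the list on stderr, so capture both streams.
    if srun --mpi=list 2>&1 | has_pmix_plugin; then
        echo "pmix plugin available; launch with: srun --mpi=pmix <your app>"
    else
        echo "no pmix plugin listed; was Slurm configured with --with-pmix?"
    fi
fi
```

If the plugin is not listed, rebuilding Slurm with `--with-pmix` (as in the quoted thread below) is the first thing to check.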

You actually might be able to use the PMIx backward compatibility, and you might want to do so with slurm 17.11 and above as Mellanox did a nice job of further optimizing launch performance on IB platforms by adding fabric-based collective implementations to the pmix plugin. If you replace the slurm libpmi and libpmi2 with the ones from PMIx, what will happen is that PMI and PMI2 calls will be converted to their PMIx equivalent and passed to the pmix plugin. This lets you take advantage of what Mellanox did.
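
To see which flavor of libpmi and libpmi2 a node currently has, something like the following may help. It is a sketch that assumes an RPM-based system with the libraries in /usr/lib64 (the path from the error quoted below); `pmi_lib_provider` is a hypothetical helper, and the package-name prefixes it matches are assumptions based on the package names in that error.

```shell
# Hypothetical helper: classify an RPM package name as the PMIx or the
# Slurm provider of libpmi/libpmi2.
pmi_lib_provider() {
    case "$1" in
        pmix-*)  echo "PMIx compatibility library" ;;
        slurm-*) echo "Slurm native library" ;;
        *)       echo "unknown" ;;
    esac
}

for lib in /usr/lib64/libpmi.so /usr/lib64/libpmi2.so; do
    if command -v rpm >/dev/null 2>&1 && [ -e "$lib" ]; then
        # rpm -qf prints the name of the package that owns the file.
        owner=$(rpm -qf "$lib" 2>/dev/null) || owner="unowned"
        echo "$lib: $owner ($(pmi_lib_provider "$owner"))"
    fi
done
```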

The caveat is that your MPI might ask for some PMI/PMI2 feature that we didn’t implement. We have tested with MPICH as well as OMPI and it was fine - but we cannot give you a blanket guarantee (e.g., I’m pretty sure MVAPICH won’t work). Probably safer to stick with the slurm libs for that reason unless you test to ensure it all works.
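
If you do want to test the swap, a dry run along these lines shows what would be moved before anything is touched. This is only a sketch of the move-aside approach described in the quoted thread below: `plan_pmi_lib_move` and the stash directory are hypothetical, and it assumes the PMIx copies are the ones currently sitting in /usr/lib64. It prints `mv` commands rather than running them.

```shell
# Hypothetical dry-run helper: print the mv commands that would move every
# libpmi.so* and libpmi2.so* file from directory $1 into directory $2.
# libpmix.so itself is deliberately not matched.
plan_pmi_lib_move() {
    src=$1
    dst=$2
    for f in "$src"/libpmi.so* "$src"/libpmi2.so*; do
        if [ -e "$f" ]; then
            echo "mv $f $dst/"
        fi
    done
}

# Show the plan for stashing the libraries (the stash path is an assumption).
plan_pmi_lib_move /usr/lib64 /root/pmix-pmi-libs
```

Swapping the two arguments gives the plan for putting the stashed copies back once Slurm has been installed.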


> On Nov 28, 2017, at 6:42 PM, Paul Edmon <pedmon at cfa.harvard.edu> wrote:
> 
> Okay, I didn't see any note on the PMIx 2.1 page about versions of slurm it was compatible with so I assumed all of them.  My bad.  Thanks for the correction and the help.  I just naively used the rpm spec that was packaged with PMIx which does enable the legacy support.  It seems best then to let PMIx handle pmix solely and let slurm handle the rest.  Thanks!
> 
> Am I right in reading that you don't have to build slurm against PMIx?  So it just interoperates with it fine if you just have it installed and specify pmix as the launch option?  That's neat.
> -Paul Edmon-
> 
> On 11/28/2017 6:11 PM, Philip Kovacs wrote:
>> Actually if you're set on installing pmix/pmix-devel from the rpms and then configuring slurm manually,
>> you could just move the pmix-installed versions of libpmi.so* and libpmi2.so* to a safe place, configure
>> and install slurm which will drop in its versions of those libs and then either use the slurm versions or move
>> the pmix versions of libpmi and libpmi2 back into place in /usr/lib64.
>> 
>> 
>> On Tuesday, November 28, 2017 5:32 PM, Philip Kovacs <pkdevel at yahoo.com> wrote:
>> 
>> 
>> The issue is that PMIx 2.0+ provides a "backward compatibility" feature, enabled by default, which installs
>> both libpmi.so and libpmi2.so in addition to libpmix.so.  The route with the least friction for you would probably
>> be to uninstall pmix, then install slurm normally, letting it install its libpmi and libpmi2.  Next configure and compile
>> a custom pmix with that backward feature _disabled_, so it only installs libpmix.so.   Slurm will "see" the pmix library
>> after you install it and load it via its plugin when you use --mpi=pmix.   Again, just use the Slurm pmi and pmi2 and 
>> install pmix separately with the backward compatible option disabled.
>> 
>> There is a packaging issue there in which two packages are trying to install their own versions of the same files.  
>> That should be brought to the attention of the packagers.  Meantime you can work around it.
>> 
>> For PMIX:
>> 
>> ./configure --disable-pmi-backward-compatibility // ... etc ...
>> 
>> 
>> 
>> On Tuesday, November 28, 2017 4:44 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
>> 
>> 
>> Hello, Paul
>> 
>> Please see below.
>> 
>> 2017-11-28 13:13 GMT-08:00 Paul Edmon <pedmon at cfa.harvard.edu>:
>> So in an effort to future proof ourselves we are trying to build Slurm against PMIx, but when I tried to do so I got the following:
>> 
>> Transaction check error:
>>   file /usr/lib64/libpmi.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
>>   file /usr/lib64/libpmi2.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
>> 
>> This is with compiling Slurm with the --with-pmix=/usr option.  A few things:
>> 
>> 1. I'm surprised when I tell it to use PMIx it still builds its own versions of libpmi and pmi2 given that PMIx handles that now.
>> 
>> PMIx is a plugin and from multiple perspectives it makes sense to keep the other versions available (i.e. backward compat or perf comparison) 
>>  
>> 
>> 2. Does this mean I have to install PMIx in a nondefault location?  If so how does that work with user build codes?  I'd rather not have multiple versions of PMI around for people to build against.
>> When we introduced PMIx it was in the beta stage and we didn't want to build against it by default. Now it probably makes sense to assume --with-pmix by default.
>> I'm also thinking that we might need to solve it at the packagers' level by distributing a "slurm-pmix" package that is built against and depends on the pmix package currently shipped with the particular Linux distro.
>>  
>> 
>> 3.  What is the right way of building PMIx and Slurm such that they interoperate properly?
>> For now it is better to have PMIx installed in a well-known location, and then build your MPIs or other apps against this PMIx installation.
>> Starting (I think) from PMIx v2.1 we will have cross-version support that will give some flexibility about which installation to use with an application.
>>  
>> 
>> Suffice it to say little to no documentation exists on how to properly do this, so any guidance would be much appreciated.
>> Indeed we have some problems with the documentation as PMIx technology is relatively new. Hopefully we can fix this in the near future.
>> Being the original developer of the PMIx plugin I'll be happy to answer any questions and help to resolve the issues.
>> 
>> 
>>  
>> 
>> 
>> -Paul Edmon-
>> 
>> 
>> 
>> 
>> 
>> 
>> -- 
>> С Уважением, Поляков Артем Юрьевич
>> Best regards, Artem Y. Polyakov
>> 
>> 
>> 
>> 
> 
