[slurm-users] PMIx and Slurm
pkdevel at yahoo.com
Tue Nov 28 21:52:31 MST 2017
I double-checked and yes, you definitely want the pmix headers and libpmix library installed before you configure slurm. No need to use --with-pmix if pmix is installed in standard system locations. Configure slurm and it will see the pmix installation. After configuring slurm, but before installing it, manually remove the pmix versions of libpmi.so* and libpmi2.so*. Install slurm and use its versions of those libs. Test every mpi variant seen when you run `srun --mpi=list hostname`. You should see pmi2 and pmix in that list and several others. The pmix option will invoke a slurm plugin that is linked directly to the libpmix.so library. If you favor using the pmix versions of pmi/pmi2, it sounds like you'll get better performance when using pmi/pmi2, but as mentioned, you would want to test every mpi variant listed to make sure everything works.
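The "test every mpi variant" step can be sketched as a small shell loop. This is only an illustration: the plugin names are passed in by hand rather than parsed from `srun --mpi=list`, since the format of that list differs between Slurm releases, and the `SRUN` variable is a hypothetical override added here so the loop can be dry-run without a cluster.

```shell
#!/bin/sh
# Run a trivial job under each MPI plugin type given as an argument and
# report which ones launch. Returns non-zero if any variant fails.
# SRUN is a hypothetical override (not a Slurm setting); it defaults to srun.
test_mpi_variants() {
    failed=0
    for variant in "$@"; do
        if ${SRUN:-srun} --mpi="$variant" hostname >/dev/null 2>&1; then
            echo "OK   $variant"
        else
            echo "FAIL $variant"
            failed=1
        fi
    done
    return "$failed"
}

# Example: test_mpi_variants pmi2 pmix
```

Running `hostname` only shows that the plugin loads and the job launches. To confirm that the PMI wire-up itself works, substitute a real MPI program built against each MPI library you support.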
On Tuesday, November 28, 2017 9:57 PM, "rhc at open-mpi.org" <rhc at open-mpi.org> wrote:
My apologies - I guess we hadn’t been tracking it that way. I’ll try to add some clarification. We presented a nice table at the BoF and I just need to find a few minutes to post it.
I believe you do have to build slurm against PMIx so that the pmix plugin is compiled. You then also have to specify --mpi=pmix so slurm knows to use that plugin for this specific job.
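A quick way to confirm that the pmix plugin was compiled is to look for it in the plugin list before launching with it. The helper below is a sketch, not part of Slurm: `mpi_type_listed` is a made-up name, `my_mpi_app` is a placeholder binary, and stderr is merged into stdout on the assumption that some releases print the list there.

```shell
# Succeeds if the MPI plugin type named in $1 appears as a whole word
# in the plugin list read from stdin.
mpi_type_listed() {
    grep -qw -- "$1"
}

# Usage (my_mpi_app is a placeholder for your own MPI binary):
#   srun --mpi=list 2>&1 | mpi_type_listed pmix \
#       && srun --mpi=pmix -n 4 ./my_mpi_app
```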
You actually might be able to use the PMIx backward compatibility, and you might want to do so with slurm 17.11 and above as Mellanox did a nice job of further optimizing launch performance on IB platforms by adding fabric-based collective implementations to the pmix plugin. If you replace the slurm libpmi and libpmi2 with the ones from PMIx, what will happen is that PMI and PMI2 calls will be converted to their PMIx equivalent and passed to the pmix plugin. This lets you take advantage of what Mellanox did.
The caveat is that your MPI might ask for some PMI/PMI2 feature that we didn’t implement. We have tested with MPICH as well as OMPI and it was fine - but we cannot give you a blanket guarantee (e.g., I’m pretty sure MVAPICH won’t work). Probably safer to stick with the slurm libs for that reason unless you test to ensure it all works.
On Nov 28, 2017, at 6:42 PM, Paul Edmon <pedmon at cfa.harvard.edu> wrote:
Okay, I didn't see any note on the PMIx 2.1 page about versions of slurm it was compatible with so I assumed all of them. My bad. Thanks for the correction and the help. I just naively used the rpm spec that was packaged with PMIx which does enable the legacy support. It seems best then to let PMIx handle pmix solely and let slurm handle the rest. Thanks!
Am I right in reading that you don't have to build slurm against PMIx? So it just interoperates with it fine if you just have it installed and specify pmix as the launch option? That's neat.
On 11/28/2017 6:11 PM, Philip Kovacs wrote:
Actually, if you're set on installing pmix/pmix-devel from the rpms and then configuring slurm manually, you could just move the pmix-installed versions of libpmi.so* and libpmi2.so* to a safe place, then configure and install slurm, which will drop in its versions of those libs. After that, either use the slurm versions or move the pmix versions of libpmi and libpmi2 back into place in /usr/lib64.
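The move-aside step might look like the following sketch. The function name and the stash directory are invented for illustration; only /usr/lib64 and the library names come from the thread.

```shell
# Move libpmi.so* and libpmi2.so* from directory $1 into directory $2.
# The glob libpmi.so* does not match libpmix.so, so the PMIx library
# itself stays where it is.
move_pmi_libs() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for lib in "$src"/libpmi.so* "$src"/libpmi2.so*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$lib" ] || [ -L "$lib" ] || continue
        mv "$lib" "$dest"/ || return 1
    done
}

# Usage, as root (the stash directory is a made-up example):
#   move_pmi_libs /usr/lib64 /root/pmix-pmi-stash   # set the PMIx copies aside
#   ... configure, build and install Slurm ...
#   move_pmi_libs /root/pmix-pmi-stash /usr/lib64   # optional: PMIx copies back
```

Note that moving the PMIx copies back overwrites the Slurm-installed files of the same name, so keep a copy of those too if you want to switch between the two.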
On Tuesday, November 28, 2017 5:32 PM, Philip Kovacs <pkdevel at yahoo.com> wrote:
The issue is that pmix 2.0+ provides a "backward compatibility" feature, enabled by default, which installs both libpmi.so and libpmi2.so in addition to libpmix.so. The route with the least friction for you would probably be to uninstall pmix, then install slurm normally, letting it install its libpmi and libpmi2. Next configure and compile a custom pmix with that backward feature _disabled_, so it only installs libpmix.so. Slurm will "see" the pmix library after you install it and load it via its plugin when you use --mpi=pmix. Again, just use the Slurm pmi and pmi2 and install pmix separately with the backward compatible option disabled.
There is a packaging issue there in which two packages are trying to install their own versions of the same files. That should be brought to the attention of the packagers. In the meantime you can work around it.
./configure --disable-pmi-backward-compatibility // ... etc ...
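After installing a PMIx built this way, it may be worth checking that no PMI-1/PMI-2 compatibility libraries ended up in its library directory. The helper below is a sketch with a made-up name, and the install prefix in the usage line is hypothetical.

```shell
# Print any PMI-1/PMI-2 compatibility libraries found in directory $1.
# No output means none are present there (libpmix.so is not matched).
list_pmi_compat_libs() {
    for lib in "$1"/libpmi.so* "$1"/libpmi2.so*; do
        if [ -e "$lib" ] || [ -L "$lib" ]; then
            echo "$lib"
        fi
    done
}

# Usage (the prefix is a hypothetical example; use the one given to configure):
#   list_pmi_compat_libs /opt/pmix/lib
```

If a file does show up in a shared directory such as /usr/lib64, `rpm -qf <file>` on an RPM-based system reports which package owns it.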
On Tuesday, November 28, 2017 4:44 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
Please see below.
2017-11-28 13:13 GMT-08:00 Paul Edmon <pedmon at cfa.harvard.edu>:
So in an effort to future proof ourselves we are trying to build Slurm against PMIx, but when I tried to do so I got the following:
Transaction check error:
file /usr/lib64/libpmi.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
file /usr/lib64/libpmi2.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
This is with compiling Slurm with the --with-pmix=/usr option. A few things:
1. I'm surprised when I tell it to use PMIx it still builds its own versions of libpmi and pmi2 given that PMIx handles that now.
PMIx is a plugin and from multiple perspectives it makes sense to keep the other versions available (i.e. backward compat or perf comparison)
2. Does this mean I have to install PMIx in a nondefault location? If so how does that work with user build codes? I'd rather not have multiple versions of PMI around for people to build against.
When we introduced PMIx it was in the beta stage and we didn't want to build against it by default. Now it probably makes sense to assume --with-pmix by default. I'm also thinking that we might need to solve it at the packagers' level by distributing a "slurm-pmix" package that is built against, and depends on, the pmix package currently shipped with the particular Linux distro.
3. What is the right way of building PMIx and Slurm such that they interoperate properly?
For now it is better to have PMIx installed in a well-known location, and then build your MPIs or other apps against this PMIx installation. Starting (I think) from PMIx v2.1 we will have cross-version support that will give some flexibility about which installation to use with an application.
Suffice it to say little to no documentation exists on how to properly do this, so any guidance would be much appreciated.
Indeed we have some problems with the documentation as PMIx technology is relatively new. Hopefully we can fix this in the near future. Being the original developer of the PMIx plugin I'll be happy to answer any questions and help to resolve the issues.
С Уважением, Поляков Артем Юрьевич
Best regards, Artem Y. Polyakov