[slurm-users] PMIx and Slurm

Philip Kovacs pkdevel at yahoo.com
Tue Nov 28 21:52:31 MST 2017


I double-checked and yes, you definitely want the pmix headers and the libpmix library installed before you configure slurm. There is no need to use --with-pmix if pmix is installed in standard system locations. Configure slurm and it will see the pmix installation.  After configuring slurm, but before installing it, manually remove the pmix versions of libpmi.so* and libpmi2.so*. Install slurm and use its versions of those libs.  Test every mpi variant listed when you run `srun --mpi=list hostname`.  You should see pmi2 and pmix in that list, along with several others.  The pmix option will invoke a slurm plugin that is linked directly to the libpmix.so library.  If you favor using the pmix versions of libpmi/libpmi2 instead, it sounds like you'll get better performance when using pmi/pmi2, but as mentioned, you would want to test every mpi variant listed to make sure everything works.
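
A minimal sketch of the "test every mpi variant" step, assuming you pass in the names you saw in your own `srun --mpi=list` output. The function name test_mpi_variants and the SRUN override variable are hypothetical, introduced only for illustration. Note that running hostname only shows that the launch works; a real MPI program would be a stronger check.

```shell
# Hypothetical helper: launch a trivial job under each MPI plugin type given
# as an argument and report which ones start successfully.
# SRUN may be overridden (an assumption of this sketch, e.g. SRUN=true)
# to dry-run the loop on a machine without Slurm.
test_mpi_variants() {
    failed=0
    for variant in "$@"; do
        if ${SRUN:-srun} --mpi="$variant" hostname >/dev/null 2>&1; then
            echo "ok: $variant"
        else
            echo "FAIL: $variant"
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any variant failed to launch.
    [ "$failed" -eq 0 ]
}

# Example, using names taken from your own `srun --mpi=list` output:
#   test_mpi_variants pmi2 pmix
```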

    On Tuesday, November 28, 2017 9:57 PM, "rhc at open-mpi.org" <rhc at open-mpi.org> wrote:
 

 My apologies - I guess we hadn’t been tracking it that way. I’ll try to add some clarification. We presented a nice table at the BoF and I just need to find a few minutes to post it.
I believe you do have to build slurm against PMIx so that the pmix plugin is compiled. You then also have to specify --mpi=pmix so slurm knows to use that plugin for this specific job.
You actually might be able to use the PMIx backward compatibility, and you might want to do so with slurm 17.11 and above as Mellanox did a nice job of further optimizing launch performance on IB platforms by adding fabric-based collective implementations to the pmix plugin. If you replace the slurm libpmi and libpmi2 with the ones from PMIx, what will happen is that PMI and PMI2 calls will be converted to their PMIx equivalent and passed to the pmix plugin. This lets you take advantage of what Mellanox did.
The caveat is that your MPI might ask for some PMI/PMI2 feature that we didn’t implement. We have tested with MPICH as well as OMPI and it was fine - but we cannot give you a blanket guarantee (e.g., I’m pretty sure MVAPICH won’t work). Probably safer to stick with the slurm libs for that reason unless you test to ensure it all works.


On Nov 28, 2017, at 6:42 PM, Paul Edmon <pedmon at cfa.harvard.edu> wrote:
 
Okay, I didn't see any note on the PMIx 2.1 page about which versions of slurm it was compatible with, so I assumed all of them.  My bad.  Thanks for the correction and the help.  I just naively used the rpm spec that was packaged with PMIx, which does enable the legacy support.  It seems best then to let PMIx handle pmix solely and let slurm handle the rest.  Thanks! Am I right in reading that you don't have to build slurm against PMIx?  So it just interoperates with it fine if you just have it installed and specify pmix as the launch option?  That's neat.
 -Paul Edmon-
  
 On 11/28/2017 6:11 PM, Philip Kovacs wrote:
  
  Actually, if you're set on installing pmix/pmix-devel from the rpms and then configuring slurm manually, you could just move the pmix-installed versions of libpmi.so* and libpmi2.so* to a safe place, then configure and install slurm, which will drop in its versions of those libs, and then either use the slurm versions or move the pmix versions of libpmi and libpmi2 back into place in /usr/lib64.
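
  A rough sketch of that move-aside-and-restore approach, under the assumption that the library directory and a stash directory are passed in explicitly. The function names stash_pmi_libs and restore_pmi_libs are hypothetical. Restoring overwrites whatever libpmi/libpmi2 files are in the library directory at that point, so keep a copy of the slurm versions first if you may want them back.

```shell
# Hypothetical helpers: move libpmi.so* and libpmi2.so* out of a library
# directory before installing Slurm, and move them back afterwards.
# The patterns do not match libpmix.so*, which is left untouched.
stash_pmi_libs() {
    libdir=$1 stash=$2
    mkdir -p "$stash" || return 1
    for f in "$libdir"/libpmi.so* "$libdir"/libpmi2.so*; do
        # Skip patterns that matched nothing; keep dangling symlinks.
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv -- "$f" "$stash"/ || return 1
    done
}

restore_pmi_libs() {
    libdir=$1 stash=$2
    for f in "$stash"/libpmi.so* "$stash"/libpmi2.so*; do
        [ -e "$f" ] || [ -L "$f" ] || continue
        # Overwrites any file of the same name already in libdir.
        mv -- "$f" "$libdir"/ || return 1
    done
}

# Example (paths are illustrative):
#   stash_pmi_libs /usr/lib64 /root/pmix-pmi-libs
#   ... configure and install slurm ...
#   restore_pmi_libs /usr/lib64 /root/pmix-pmi-libs   # only if you want the pmix versions
```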
 
      On Tuesday, November 28, 2017 5:32 PM, Philip Kovacs <pkdevel at yahoo.com> wrote:
  
 
        The issue is that pmix 2.0+ provides a "backward compatibility" feature, enabled by default, which installs both libpmi.so and libpmi2.so in addition to libpmix.so.  The route with the least friction for you would probably be to uninstall pmix, then install slurm normally, letting it install its libpmi and libpmi2.  Next, configure and compile a custom pmix with that backward compatibility feature _disabled_, so it only installs libpmix.so.  Slurm will "see" the pmix library after you install it and load it via its plugin when you use --mpi=pmix.  Again, just use the Slurm pmi and pmi2, and install pmix separately with the backward compatibility option disabled.
  There is a packaging issue here, in which two packages are trying to install their own versions of the same files.  That should be brought to the attention of the packagers.  In the meantime you can work around it.
  For PMIX: 
  ./configure --disable-pmi-backward-compatibility // ... etc ... 
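
  One way to sanity-check the result afterwards is to look at which PMI-related libraries actually ended up in a given library directory. This is only a sketch: list_pmi_libs is a hypothetical helper, and the directory you pass in is whatever prefix you installed into. In a directory that holds only a pmix build with backward compatibility disabled, you would expect libpmix to be present and libpmi/libpmi2 to be absent.

```shell
# Hypothetical check: report which PMI-related shared libraries exist in a
# library directory (matching name.so and versioned name.so.N files).
list_pmi_libs() {
    libdir=$1
    for name in libpmi libpmi2 libpmix; do
        found=no
        for f in "$libdir/$name".so*; do
            # An unmatched pattern stays literal and fails the -e test.
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = yes ]; then
            echo "$name: present"
        else
            echo "$name: absent"
        fi
    done
}

# Example: list_pmi_libs /usr/lib64
```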
        
 
    On Tuesday, November 28, 2017 4:44 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
  
 
    Hello, Paul 
  Please see below.
 
 2017-11-28 13:13 GMT-08:00 Paul Edmon <pedmon at cfa.harvard.edu>:
 
So in an effort to future proof ourselves we are  trying to build Slurm against PMIx, but when I tried to do so I got the  following:
 
 Transaction check error:
   file /usr/lib64/libpmi.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
   file /usr/lib64/libpmi2.so from install of slurm-17.02.9-1fasrc02.el7.centos.x86_64 conflicts with file from package pmix-2.0.2-1.el7.centos.x86_64
 
 This is with compiling Slurm with the --with-pmix=/usr option.  A few things:
 
 1. I'm surprised when I tell it to use PMIx it  still builds its own versions of libpmi and pmi2 given that PMIx  handles that now.
 
 
  PMIx is a plugin and from multiple perspectives it makes sense to keep the  other versions available (i.e. backward compat or perf comparison)    
 
 2. Does this mean I have to install PMIx in a  nondefault location?  If so how does that work with user build codes?  I'd rather not have multiple versions of PMI around for  people to build against.
 
 When we introduced PMIx it was in the beta stage and we didn't want to build against it by default. Now it probably makes sense to assume --with-pmix by default. I'm also thinking that we might need to solve it at the packagers' level by distributing a "slurm-pmix" package that is built against, and depends on, the pmix package currently shipped with the particular Linux distro.
 
 3.  What is the right way of building PMIx and Slurm such that they  interoperate properly?
 
 For now it is better to have PMIx installed in a well-known location, and then build your MPIs or other apps against this PMIx installation. Starting (I think) from PMIx v2.1 we will have cross-version support that will give some flexibility about which installation to use with an application.
 
 Suffice it to say little to no documentation exists on how to properly do this, so any guidance would be much appreciated.
 Indeed we have some problems with the documentation as PMIx technology is relatively new. Hopefully we can fix this in the near future. Being the original developer of the PMIx plugin I'll be happy to answer any questions and help to resolve the issues.
  
     
 
 
 -Paul Edmon- 
 
 
 
  
 
 
  -- 
 С Уважением, Поляков Артем Юрьевич
 Best regards, Artem Y. Polyakov        
 
             
 
      
 
 


   