[slurm-users] Failed to launch jobs with mpirun after upgrading to Slurm 19.05

Levi Morrison levi_morrison at byu.edu
Thu Jun 6 17:15:59 UTC 2019


Slurm 19.05 removed support for `--cpu_bind` (the underscore spelling; 
`--cpu-bind` is the supported form), which is what /all/ released 
versions of OpenMPI use when they call into `srun`. This issue was fixed 
24 days ago in [OpenMPI's git repo][1].

This means /all/ OpenMPI programs that end up calling `srun` on Slurm 
19.05 will fail.

This enormous amount of breakage for such a minor "gain" seems unwise. I 
think this [change][2] should be backed out and converted to a warning 
message to allow time for the OpenMPI changes to be backported, 
released, and adopted.
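
Until a fixed OpenMPI release is available, one possible stopgap is a 
wrapper placed ahead of the real `srun` in PATH that rewrites the old 
option spelling to `--cpu-bind`. What follows is only a minimal sketch 
of the argument rewriting, assuming the underscore spelling is the only 
incompatibility. The function name `rewrite_srun_args` is hypothetical; 
nothing like it ships with Slurm or OpenMPI.

```python
def rewrite_srun_args(argv):
    """Return a copy of argv with `--cpu_bind` respelled as `--cpu-bind`.

    Hypothetical helper for illustration only. It rewrites every matching
    argument, including any that appear after the launched command, which
    is an assumption that may not suit every site.
    """
    old, new = "--cpu_bind", "--cpu-bind"
    fixed = []
    for arg in argv:
        # Match the bare option or the `--cpu_bind=value` form, but leave
        # unrelated options that merely share the prefix untouched.
        if arg == old or arg.startswith(old + "="):
            arg = new + arg[len(old):]
        fixed.append(arg)
    return fixed
```

A wrapper script could pass `sys.argv[1:]` through this function and 
then exec the real `srun` binary with the result.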

Levi Morrison
Brigham Young University

   [1]: https://github.com/open-mpi/ompi/commit/7dad74032e30259506da7fa582dd8c4351e6e0a1
   [2]: https://github.com/SchedMD/slurm/commit/d78af893e4a60e933a2319b0c36a0e40c7dd1b02



