[slurm-users] Failed to launch jobs with mpirun after upgrading to Slurm 19.05
Levi Morrison
levi_morrison at byu.edu
Thu Jun 6 17:15:59 UTC 2019
Slurm 19.05 removed support for `--cpu_bind` (only the hyphenated
spelling, `--cpu-bind`, is still accepted), which is what /all/
released versions of OpenMPI pass when they call into `srun`. This
issue was fixed 24 days ago in [OpenMPI's git repo][1].
This means /all/ OpenMPI programs that end up calling `srun` on Slurm
19.05 will fail.
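
As a stopgap until a fixed OpenMPI release is available, one possible
workaround is a wrapper placed ahead of the real `srun` that rewrites
the removed spelling to `--cpu-bind`. The sketch below is an
assumption, not something taken from either project: the function
name `rewrite_args` is made up for illustration, and it only covers
the argument rewriting, not the exec of the real binary.

```python
# Hypothetical helper for an srun wrapper script; not part of Slurm or OpenMPI.
OLD = "--cpu_bind"   # spelling removed in Slurm 19.05
NEW = "--cpu-bind"   # spelling that srun still accepts


def rewrite_args(args):
    """Return a copy of args with the removed option spelling replaced.

    Handles both `--cpu_bind=VALUE` and a bare `--cpu_bind` followed by
    a separate value argument. All other arguments pass through unchanged.
    """
    rewritten = []
    for arg in args:
        if arg == OLD or arg.startswith(OLD + "="):
            # Swap only the option name; keep any "=VALUE" suffix as-is.
            arg = NEW + arg[len(OLD):]
        rewritten.append(arg)
    return rewritten
```

A wrapper script could pass `sys.argv[1:]` through `rewrite_args` and
then exec the real `srun` with the result. Where the real binary lives
and how the wrapper gets onto `PATH` are site-specific.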
This enormous amount of breakage for such a minor "gain" seems unwise. I
think this [change][2] should be backed out and converted to a warning
message to allow time for the OpenMPI changes to be backported,
released, and adopted.
Levi Morrison
Brigham Young University
[1]: https://github.com/open-mpi/ompi/commit/7dad74032e30259506da7fa582dd8c4351e6e0a1
[2]: https://github.com/SchedMD/slurm/commit/d78af893e4a60e933a2319b0c36a0e40c7dd1b02