[slurm-users] OpenMPI interactive change in behavior?
Paul Edmon
pedmon at cfa.harvard.edu
Wed Apr 28 13:52:46 UTC 2021
I haven't experienced this issue here. Then again, we've been using PMIx
to launch MPI for a while now, so we may have circumvented this
particular issue.
-Paul Edmon-
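
For sites comparing their setup against a PMIx-based launch, the sketch below checks whether the local `srun` lists a given MPI plugin. It is only an illustration under stated assumptions: `has_mpi_plugin` is a hypothetical helper (not part of Slurm), the parsing assumes output shaped like the `srun --mpi=list` listings quoted below, and `./your_mpi_app` is a placeholder. Whether a `pmix` plugin shows up at all depends on how Slurm was built.

```shell
#!/bin/sh
# Sketch only: has_mpi_plugin is a hypothetical helper, not a Slurm command.
# It reads `srun --mpi=list` style output on stdin and succeeds if any listed
# plugin name starts with the prefix given as $1 (e.g. "pmix" matches pmix_v3).
has_mpi_plugin() {
    # Strip an optional "host: srun: " prefix, leading whitespace, and the
    # older "mpi/" prefix so that only the bare plugin name remains.
    sed -e 's/^.*srun: *//' -e 's/^[[:space:]]*//' -e 's#^mpi/##' |
        grep -q "^$1"
}

if command -v srun >/dev/null 2>&1; then
    if srun --mpi=list 2>&1 | has_mpi_plugin pmix; then
        # ./your_mpi_app is a placeholder for a real MPI binary.
        echo "pmix listed: try 'srun --mpi=pmix -N 2 -n 2 ./your_mpi_app'"
    else
        echo "no pmix plugin listed by srun --mpi=list"
    fi
else
    echo "srun not found on this host"
fi
```

If `pmix` is listed, launching directly with `srun --mpi=pmix` may avoid `mpirun` altogether, although OpenMPI generally also needs compatible PMIx support for that to work.
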
On 4/28/2021 9:41 AM, John DeSantis wrote:
> Hello all,
>
> Just an update, the following URL almost mirrors the issue we're seeing: https://github.com/open-mpi/ompi/issues/8378
>
> But, SLURM 20.11.3 was shipped with the fix. I've verified that the changes are in the source code.
>
> We don't want to have to downgrade SLURM to 20.02.x, but it seems that this behaviour still exists. Are no other sites on fresh installs of >= SLURM 20.11.3 experiencing this problem?
>
> I was aware of the changes in 20.11.{0..2} which received a lot of scrutiny, which is why 20.11.3 was selected.
>
> Thanks,
> John DeSantis
>
> On 4/26/21 5:12 PM, John DeSantis wrote:
>> Hello all,
>>
>> We've recently (don't laugh!) updated two of our SLURM installations from 16.05.10-2 to 20.11.3 and 17.11.9, respectively. On the newer version, 20.11.3, OpenMPI no longer seems to function in interactive mode across multiple nodes as it did previously: using `srun` and `mpirun` on a single node gives the desired results, while using multiple nodes causes a hang. Jobs submitted via `sbatch` do _work as expected_.
>>
>> [desantis@sclogin0 ~]$ scontrol show config |grep VERSION; srun -n 2 -N 2-2 -t 00:05:00 --pty /bin/bash
>> SLURM_VERSION = 17.11.9
>> [desantis@sccompute0 ~]$ for OPENMPI in mpi/openmpi/1.8.5 mpi/openmpi/2.0.4 mpi/openmpi/2.0.4-psm2 mpi/openmpi/2.1.6 mpi/openmpi/3.1.6 compilers/intel/2020_cluster_xe; do module load $OPENMPI ; which mpirun; mpirun hostname; module purge; echo; done
>> /apps/openmpi/1.8.5/bin/mpirun
>> sccompute0
>> sccompute1
>>
>> /apps/openmpi/2.0.4/bin/mpirun
>> sccompute1
>> sccompute0
>>
>> /apps/openmpi/2.0.4-psm2/bin/mpirun
>> sccompute1
>> sccompute0
>>
>> /apps/openmpi/2.1.6/bin/mpirun
>> sccompute0
>> sccompute1
>>
>> /apps/openmpi/3.1.6/bin/mpirun
>> sccompute0
>> sccompute1
>>
>> /apps/intel/2020_u2/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/mpirun
>> sccompute1
>> sccompute0
>>
>>
>> 15:58:28 Mon Apr 26 <0>
>> desantis@itn0
>> [~] $ scontrol show config|grep VERSION; srun -n 2 -N 2-2 --qos=devel --partition=devel -t 00:05:00 --pty /bin/bash
>> SLURM_VERSION = 20.11.3
>> srun: job 1019599 queued and waiting for resources
>> srun: job 1019599 has been allocated resources
>> 15:58:46 Mon Apr 26 <0>
>> desantis@mdc-1057-30-1
>> [~] $ for OPENMPI in mpi/openmpi/1.8.5 mpi/openmpi/2.0.4 mpi/openmpi/2.0.4-psm2 mpi/openmpi/2.1.6 mpi/openmpi/3.1.6 compilers/intel/2020_cluster_xe; do module load $OPENMPI ; which mpirun; mpirun hostname; module purge; echo; done
>> /apps/openmpi/1.8.5/bin/mpirun
>> ^C
>> /apps/openmpi/2.0.4/bin/mpirun
>> ^C
>> /apps/openmpi/2.0.4-psm2/bin/mpirun
>> ^C
>> /apps/openmpi/2.1.6/bin/mpirun
>> ^C
>> /apps/openmpi/3.1.6/bin/mpirun
>> ^C
>> /apps/intel/2020_u2/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/mpirun
>> ^C[mpiexec@mdc-1057-30-1] Sending Ctrl-C to processes as requested
>> [mpiexec@mdc-1057-30-1] Press Ctrl-C again to force abort
>> ^C
>>
>> Our SLURM installations are fairly straightforward. We `rpmbuild` directly from the bzip2 files without any additional arguments. We've done this since we first started using SLURM with version 14.03.3-2 and through all upgrades. Due to SLURM's awesomeness(!), we've simply used the same configuration files between version changes, with the only changes being made to parameters which have been deprecated/renamed. Our "Mpi{Default,Params}" have always been set to "none". The only real difference we're able to ascertain is that the MPI plugin for openmpi has been removed.
>>
>> svc-3024-5-2: SLURM_VERSION = 16.05.10-2
>> svc-3024-5-2: srun: MPI types are...
>> svc-3024-5-2: srun: mpi/openmpi
>> svc-3024-5-2: srun: mpi/mpich1_shmem
>> svc-3024-5-2: srun: mpi/mpichgm
>> svc-3024-5-2: srun: mpi/mvapich
>> svc-3024-5-2: srun: mpi/mpich1_p4
>> svc-3024-5-2: srun: mpi/lam
>> svc-3024-5-2: srun: mpi/none
>> svc-3024-5-2: srun: mpi/mpichmx
>> svc-3024-5-2: srun: mpi/pmi2
>>
>> viking: SLURM_VERSION = 20.11.3
>> viking: srun: MPI types are...
>> viking: srun: cray_shasta
>> viking: srun: pmi2
>> viking: srun: none
>>
>> sclogin0: SLURM_VERSION = 17.11.9
>> sclogin0: srun: MPI types are...
>> sclogin0: srun: openmpi
>> sclogin0: srun: none
>> sclogin0: srun: pmi2
>> sclogin0:
>>
>> As far as building OpenMPI, we've always withheld any SLURM-specific flags, i.e. "--with-slurm", although during the build process SLURM is detected.
>>
>> Because OpenMPI was always built using this method, we never had to recompile OpenMPI after subsequent SLURM upgrades, and no cluster-ready applications had to be rebuilt. The only time OpenMPI had to be rebuilt was due to OPA hardware, which was a simple addition of the "--with-psm2" flag.
>>
>> It is my understanding that the openmpi plugin "never really did anything" (per perusing the mailing list), which is why it was removed. Furthermore, searching the mailing list suggests that the appropriate method is to use `salloc` first, despite version 17.11.9 not needing `salloc` for an "interactive" session.
>>
>> Before we go further down this rabbit hole, were other sites affected with a transition from SLURM versions 16.x,17.x,18.x(?) to versions 20.x? If so, did the methodology for multinode interactive MPI sessions change?
>>
>> Thanks!
>> John DeSantis