[slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

Thu Dec 7 11:05:05 MST 2017

> On Dec 7, 2017, at 12:51 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
> 
> Also, please post the output of
> $ srun --mpi=list

[gwolosh at p-slogin bin]$ srun --mpi=list
srun: MPI types are...
srun: mpi/mpich1_shmem
srun: mpi/mpich1_p4
srun: mpi/lam
srun: mpi/openmpi
srun: mpi/none
srun: mpi/mvapich
srun: mpi/mpichmx
srun: mpi/pmi2
srun: mpi/mpichgm
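
For reference, the listing above can be checked for a given plugin with a short script. This is a minimal sketch: the sample text is copied from the listing above so that it runs without Slurm, and the helper name `has_mpi_plugin` is hypothetical. On a cluster the text would come from `srun --mpi=list` itself (possibly with `2>&1`, on the assumption that srun writes the listing to stderr).

```shell
#!/bin/sh
# Sample copied from the listing above. On a real system, replace this with
# the captured output of `srun --mpi=list` (assumption: may need 2>&1).
mpi_list='srun: MPI types are...
srun: mpi/mpich1_shmem
srun: mpi/mpich1_p4
srun: mpi/lam
srun: mpi/openmpi
srun: mpi/none
srun: mpi/mvapich
srun: mpi/mpichmx
srun: mpi/pmi2
srun: mpi/mpichgm'

# has_mpi_plugin NAME (hypothetical helper): succeed only if "mpi/NAME"
# appears as a complete line in the listing.
has_mpi_plugin() {
    printf '%s\n' "$mpi_list" | grep -qxF "srun: mpi/$1"
}

for plugin in pmi2 pmix; do
    if has_mpi_plugin "$plugin"; then
        echo "$plugin: available"
    else
        echo "$plugin: not listed"
    fi
done
```

With the listing above, pmi2 is reported as available and pmix as not listed, so `--mpi=pmi2` is a plugin type this srun recognizes.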

> 
> When the job crashes, are there any error messages in the relevant slurmd.log files or output on the screen?

On screen:

[snode4][[274,1],24][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
[snode4:5175] *** An error occurred in MPI_Bcast
[snode4:5175] *** reported by process [17956865,24]
[snode4:5175] *** on communicator MPI_COMM_WORLD
[snode4:5175] *** MPI_ERR_OTHER: known error not in list
[snode4:5175] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[snode4:5175] ***    and potentially your MPI job)
mlx4: local QP operation err (QPN 0005f3, WQE index 40000, vendor syndrome 6c, opcode = 5e)
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
[snode4][[274,1],31][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
slurmstepd: error: *** STEP 274.0 ON snode1 CANCELLED AT 2017-12-07T12:55:46 ***
[snode4:5182] *** An error occurred in MPI_Bcast
[snode4:5182] *** reported by process [17956865,31]
[snode4:5182] *** on communicator MPI_COMM_WORLD
[snode4:5182] *** MPI_ERR_OTHER: known error not in list
[snode4:5182] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[snode4:5182] ***    and potentially your MPI job)
mlx4: local QP operation err (QPN 0005f7, WQE index 40000, vendor syndrome 6c, opcode = 5e)
[snode4][[274,1],27][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
[snode4:5178] *** An error occurred in MPI_Bcast
[snode4:5178] *** reported by process [17956865,27]
[snode4:5178] *** on communicator MPI_COMM_WORLD
[snode4:5178] *** MPI_ERR_OTHER: known error not in list
[snode4:5178] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[snode4:5178] ***    and potentially your MPI job)
mlx4: local QP operation err (QPN 0005fa, WQE index 40000, vendor syndrome 6c, opcode = 5e)
srun: error: snode4: tasks 24,31: Exited with exit code 16
srun: error: snode4: tasks 25-30: Killed
srun: error: snode5: tasks 32-39: Killed
srun: error: snode3: tasks 16-23: Killed
srun: error: snode8: tasks 56-63: Killed
srun: error: snode7: tasks 48-55: Killed
srun: error: snode1: tasks 0-7: Killed
srun: error: snode2: tasks 8-15: Killed
srun: error: snode6: tasks 40-47: Killed

Nothing striking in the slurmd log.
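
To separate the tasks that failed on their own from the ones that were torn down afterwards, the srun summary lines can be filtered. This is a minimal sketch: the sample text is copied from the output above, the function name `first_failures` is hypothetical, and reading the "Killed" lines as collateral of the step being aborted is an assumption, not something the log states.

```shell
#!/bin/sh
# Sample copied from the srun error summary above.
srun_errors='srun: error: snode4: tasks 24,31: Exited with exit code 16
srun: error: snode4: tasks 25-30: Killed
srun: error: snode5: tasks 32-39: Killed
srun: error: snode3: tasks 16-23: Killed
srun: error: snode8: tasks 56-63: Killed
srun: error: snode7: tasks 48-55: Killed
srun: error: snode1: tasks 0-7: Killed
srun: error: snode2: tasks 8-15: Killed
srun: error: snode6: tasks 40-47: Killed'

# first_failures (hypothetical helper): print "node tasks exit_code" for
# every line reporting tasks that exited with a code; "Killed" lines are
# skipped.
first_failures() {
    printf '%s\n' "$srun_errors" |
        sed -n 's/^srun: error: \([^:]*\): tasks* \([^:]*\): Exited with exit code \([0-9]*\)$/\1 \2 \3/p'
}

first_failures
```

For the output above this prints a single line naming snode4, tasks 24,31 and exit code 16, which matches the node that reports the MPI_Bcast and mlx4 errors.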

> 
> 2017-12-07 9:49 GMT-08:00 Artem Polyakov <artpol84 at gmail.com>:
> Hello,
> 
> What is the value of the MpiDefault option in your Slurm configuration file?
> 
> 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:
> Hello
> 
> This is Slurm version 17.02.6 running on Scientific Linux release 7.4 (Nitrogen).
> 
> [gwolosh at p-slogin bin]$ module li
> 
> Currently Loaded Modules:
>   1) GCCcore/.5.4.0 (H)   2) binutils/.2.26 (H)   3) GCC/5.4.0-2.26   4) numactl/2.0.11   5) hwloc/1.11.3   6) OpenMPI/1.10.3
> 
> If I run
> 
> srun --nodes=8 --ntasks-per-node=8 --ntasks=64  ./ep.C.64
> 
> it runs successfully, but I get this message:
> 
> PMI2 initialized but returned bad values for size/rank/jobid.
> This is symptomatic of either a failure to use the
> "--mpi=pmi2" flag in SLURM, or a borked PMI2 installation.
> If running under SLURM, try adding "-mpi=pmi2" to your
> srun command line. If that doesn't work, or if you are
> not running under SLURM, try removing or renaming the
> pmi2.h header file so PMI2 support will not automatically
> be built, reconfigure and build OMPI, and then try again
> with only PMI1 support enabled.
> 
> If I run
> 
> srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ep.C.64
> 
> the job crashes.
> 
> If I run via sbatch:
> 
> #!/bin/bash
> # Job name:
> #SBATCH --job-name=nas_bench
> #SBATCH --nodes=8
> #SBATCH --ntasks=64
> #SBATCH --ntasks-per-node=8
> #SBATCH --time=48:00:00
> #SBATCH --output=nas.out.1
> #
> ## Command(s) to run (example):
> module use $HOME/easybuild/modules/all/Core
> module load GCC/5.4.0-2.26 OpenMPI/1.10.3
> mpirun -np 64  ./ep.C.64
> 
> the job crashes.
> 
> Using EasyBuild, these are my configure options for Open MPI:
> 
> configopts = '--with-threads=posix --enable-shared --enable-mpi-thread-multiple --with-verbs '
> configopts += '--enable-mpirun-prefix-by-default '  # suppress failure modes in relation to mpirun path
> configopts += '--with-hwloc=$EBROOTHWLOC '  # hwloc support
> configopts += '--disable-dlopen '  # statically link component, don't do dynamic loading
> configopts += '--with-slurm --with-pmi '
> 
> And finally:
> 
> $ ldd /opt/local/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/bin/orterun | grep pmi
>         libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00007f0129d6d000)
>         libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x00007f0129b51000)
> 
> $ ompi_info | grep pmi
>                   MCA db: pmi (MCA v2.0.0, API v1.0.0, Component v1.10.3)
>                  MCA ess: pmi (MCA v2.0.0, API v3.0.0, Component v1.10.3)
>              MCA grpcomm: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
>               MCA pubsub: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
> 
> 
> Any suggestions?
> _______________
> Gedaliah Wolosh
> IST Academic and Research Computing Systems (ARCS)
> NJIT
> GITC 2203
> 973 596 5437
> gwolosh at njit.edu
> 
> 
> 
> 
> -- 
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
> 
> 
> 
> -- 
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
