[slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

Artem Polyakov artpol84 at gmail.com
Thu Dec 7 11:18:24 MST 2017


A couple of things to try to locate the issue:

1. To confirm whether PMI is the problem: have you tried running something
simple, like hello_world (
https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c) and ring (
https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c)? Please run
those two and post the results (see the build-and-run sketch after this
list).
2. If hello works but ring does not, can you try switching the fabric to
TCP:
$ export OMPI_MCA_btl=tcp,self
$ export OMPI_MCA_pml=ob1
$ srun ...
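
For reference, here is a minimal build-and-run sketch for item 1 (just a
sketch: it assumes mpicc from the loaded OpenMPI module is on the PATH, and
it reuses the node/task layout from the failing run):

$ mpicc hello_c.c -o hello_c   # hello_c.c from the first link above
$ mpicc ring_c.c -o ring_c     # ring_c.c from the second link above
$ srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./hello_c
$ srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ring_c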

Please provide the outputs.

2017-12-07 10:05 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:

>
>
> On Dec 7, 2017, at 12:51 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
>
> Also, please post the output of
> $ srun --mpi=list
>
>
> [gwolosh at p-slogin bin]$ srun --mpi=list
> srun: MPI types are...
> srun: mpi/mpich1_shmem
> srun: mpi/mpich1_p4
> srun: mpi/lam
> srun: mpi/openmpi
> srun: mpi/none
> srun: mpi/mvapich
> srun: mpi/mpichmx
> srun: mpi/pmi2
> srun: mpi/mpichgm
>
>
>
> When the job crashes, are there any error messages in the relevant
> slurmd.logs or output on the screen?
>
>
> On screen:
>
> [snode4][[274,1],24][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
> [snode4:5175] *** An error occurred in MPI_Bcast
> [snode4:5175] *** reported by process [17956865,24]
> [snode4:5175] *** on communicator MPI_COMM_WORLD
> [snode4:5175] *** MPI_ERR_OTHER: known error not in list
> [snode4:5175] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [snode4:5175] ***    and potentially your MPI job)
> mlx4: local QP operation err (QPN 0005f3, WQE index 40000, vendor syndrome 6c, opcode = 5e)
> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
> [snode4][[274,1],31][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
> slurmstepd: error: *** STEP 274.0 ON snode1 CANCELLED AT 2017-12-07T12:55:46 ***
> [snode4:5182] *** An error occurred in MPI_Bcast
> [snode4:5182] *** reported by process [17956865,31]
> [snode4:5182] *** on communicator MPI_COMM_WORLD
> [snode4:5182] *** MPI_ERR_OTHER: known error not in list
> [snode4:5182] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [snode4:5182] ***    and potentially your MPI job)
> mlx4: local QP operation err (QPN 0005f7, WQE index 40000, vendor syndrome 6c, opcode = 5e)
> [snode4][[274,1],27][connect/btl_openib_connect_udcm.c:1448:udcm_wait_for_send_completion] send failed with verbs status 2
> [snode4:5178] *** An error occurred in MPI_Bcast
> [snode4:5178] *** reported by process [17956865,27]
> [snode4:5178] *** on communicator MPI_COMM_WORLD
> [snode4:5178] *** MPI_ERR_OTHER: known error not in list
> [snode4:5178] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [snode4:5178] ***    and potentially your MPI job)
> mlx4: local QP operation err (QPN 0005fa, WQE index 40000, vendor syndrome 6c, opcode = 5e)
> srun: error: snode4: tasks 24,31: Exited with exit code 16
> srun: error: snode4: tasks 25-30: Killed
> srun: error: snode5: tasks 32-39: Killed
> srun: error: snode3: tasks 16-23: Killed
> srun: error: snode8: tasks 56-63: Killed
> srun: error: snode7: tasks 48-55: Killed
> srun: error: snode1: tasks 0-7: Killed
> srun: error: snode2: tasks 8-15: Killed
> srun: error: snode6: tasks 40-47: Killed
>
> Nothing striking in the slurmd log.
>
>
>
> 2017-12-07 9:49 GMT-08:00 Artem Polyakov <artpol84 at gmail.com>:
>
>> Hello,
>>
>> What is the value of the MpiDefault option in your Slurm configuration
>> file?
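>>
>> If you are not sure, one quick way to check it (just a sketch; scontrol
>> ships with Slurm and dumps the running configuration) is:
>>
>> $ scontrol show config | grep -i MpiDefault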
>>
>> 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:
>>
>>> Hello
>>>
>>> This is Slurm version 17.02.6 running on Scientific Linux release 7.4
>>> (Nitrogen).
>>>
>>> [gwolosh at p-slogin bin]$ module li
>>>
>>> Currently Loaded Modules:
>>>   1) GCCcore/.5.4.0 (H)   2) binutils/.2.26 (H)   3) GCC/5.4.0-2.26   4)
>>> numactl/2.0.11   5) hwloc/1.11.3   6) OpenMPI/1.10.3
>>>
>>> If I run
>>>
>>> srun --nodes=8 --ntasks-per-node=8 --ntasks=64  ./ep.C.64
>>>
>>> It runs successfully, but I get a message:
>>>
>>> PMI2 initialized but returned bad values for size/rank/jobid.
>>> This is symptomatic of either a failure to use the
>>> "--mpi=pmi2" flag in SLURM, or a borked PMI2 installation.
>>> If running under SLURM, try adding "-mpi=pmi2" to your
>>> srun command line. If that doesn't work, or if you are
>>> not running under SLURM, try removing or renaming the
>>> pmi2.h header file so PMI2 support will not automatically
>>> be built, reconfigure and build OMPI, and then try again
>>> with only PMI1 support enabled.
>>>
>>> If I run
>>>
>>> srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ep.C.64
>>>
>>> The job crashes.
>>>
>>> If I run via sbatch:
>>>
>>> #!/bin/bash
>>> # Job name:
>>> #SBATCH --job-name=nas_bench
>>> #SBATCH --nodes=8
>>> #SBATCH --ntasks=64
>>> #SBATCH --ntasks-per-node=8
>>> #SBATCH --time=48:00:00
>>> #SBATCH --output=nas.out.1
>>> #
>>> ## Command(s) to run (example):
>>> module use $HOME/easybuild/modules/all/Core
>>> module load GCC/5.4.0-2.26 OpenMPI/1.10.3
>>> mpirun -np 64  ./ep.C.64
>>>
>>> the job crashes.
>>>
>>> Using EasyBuild, these are my config options for OMPI:
>>>
>>> configopts = '--with-threads=posix --enable-shared --enable-mpi-thread-multiple --with-verbs '
>>> configopts += '--enable-mpirun-prefix-by-default '  # suppress failure modes in relation to mpirun path
>>> configopts += '--with-hwloc=$EBROOTHWLOC '  # hwloc support
>>> configopts += '--disable-dlopen '  # statically link component, don't do dynamic loading
>>> configopts += '--with-slurm --with-pmi '
>>>
>>> And finally:
>>>
>>> $ ldd /opt/local/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/bin/orterun | grep pmi
>>>         libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00007f0129d6d000)
>>>         libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x00007f0129b51000)
>>>
>>> $ ompi_info | grep pmi
>>>                   MCA db: pmi (MCA v2.0.0, API v1.0.0, Component v1.10.3)
>>>                  MCA ess: pmi (MCA v2.0.0, API v3.0.0, Component v1.10.3)
>>>              MCA grpcomm: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
>>>               MCA pubsub: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
>>>
>>>
>>> Any suggestions?
>>> _______________
>>> Gedaliah Wolosh
>>> IST Academic and Research Computing Systems (ARCS)
>>> NJIT
>>> GITC 2203
>>> 973 596 5437
>>> gwolosh at njit.edu
>>>
>>>
>>
>>
>> --
>> С Уважением, Поляков Артем Юрьевич
>> Best regards, Artem Y. Polyakov
>>
>
>
>
> --
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
>
>
>


-- 
С Уважением, Поляков Артем Юрьевич
Best regards, Artem Y. Polyakov