[slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

Artem Polyakov artpol84 at gmail.com
Thu Dec 7 14:31:52 MST 2017


You seem to be using a very old Open MPI release (1.10.3; the current one is
3.0), so I'd suggest trying the newer version if you can.
This also looks like a pure Open MPI problem, so the OMPI dev list may be a
more appropriate place for this topic.
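
For anyone reading this thread later, here is a minimal sketch of the TCP-only
run that worked in the quoted messages below. The two variables and the srun
line are taken from this thread; the guard around srun and the final echo are
illustrative additions (assumptions, not part of the original commands) so
that the snippet does nothing harmful on a machine without Slurm or the
benchmark binary.

```shell
#!/bin/sh
# Force Open MPI to use TCP instead of the InfiniBand (openib) path.
export OMPI_MCA_btl=tcp,self   # TCP between ranks, "self" for a rank sending to itself
export OMPI_MCA_pml=ob1        # ob1 is the PML that drives BTL components

# Illustrative guard (an assumption, not from the thread): only launch when
# srun and the benchmark binary are both present.
if command -v srun >/dev/null 2>&1 && [ -x ./ep.C.64 ]; then
    srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ep.C.64
fi

# Show the settings that the launched ranks would inherit.
echo "btl=${OMPI_MCA_btl} pml=${OMPI_MCA_pml}"
```

If the job completes this way but aborts with the default settings, that
points at the InfiniBand transport rather than at PMI.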


2017-12-07 12:53 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:

>
>
> On Dec 7, 2017, at 3:26 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
>
> Given that ring is working, I don't think it's a PMI problem.
>
> Can you try running NPB with the tcp btl parameters that I provided? (I
> assume you have a TCP interconnect; let me know if that's not the case.)
>
>
> Thu, 7 Dec 2017 at 12:03, Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:
>
>> On Dec 7, 2017, at 1:18 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
>>
>> A couple of things to try to locate the issue:
>>
>> 1. To check whether PMI is the problem: have you tried running something
>> simple, like hello_world
>> (https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c) and ring
>> (https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c)? Please
>> try to run those two and post the results.
>> 2. If hello is working and ring is not, can you try to change the fabric
>> to TCP:
>> $ export OMPI_MCA_btl=tcp,self
>> $ export OMPI_MCA_pml=ob1
>> $ srun ...
>>
>> Please provide the outputs
>>
>>
>
> export OMPI_MCA_btl=tcp,self
> export OMPI_MCA_pml=ob1
>
> srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ep.C.64
>
> This works —
>
>  NAS Parallel Benchmarks 3.3 -- EP Benchmark
>
>  Number of random numbers generated:      8589934592
>  Number of active processes:                      64
>
> EP Benchmark Results:
>
> CPU Time =    5.9208
> N = 2^   32
> No. Gaussian Pairs =    3373275903.
> Sums =     4.764367927992081D+04   -8.084072988045549D+04
> Counts:
>   0    1572172634.
>   1    1501108549.
>   2     281805648.
>   3      17761221.
>   4        424017.
>   5          3821.
>   6            13.
>   7             0.
>   8             0.
>   9             0.
>
>
>  EP Benchmark Completed.
>  Class           =                        C
>  Size            =               8589934592
>  Iterations      =                        0
>  Time in seconds =                     5.92
>  Total processes =                       64
>  Compiled procs  =                       64
>  Mop/s total     =                  1450.82
>  Mop/s/process   =                    22.67
>  Operation type  = Random numbers generated
>  Verification    =               SUCCESSFUL
>  Version         =                    3.3.1
>  Compile date    =              07 Dec 2017
>
>  Compile options:
>     MPIF77       = mpif77
>     FLINK        = $(MPIF77)
>     FMPI_LIB     = -L/opt/local/easybuild/software/Compiler/GC...
>     FMPI_INC     = -I/opt/local/easybuild/software/Compiler/GC...
>     FFLAGS       = -O
>     FLINKFLAGS   = -O
>     RAND         = randi8
>
>
>  Please send feedbacks and/or the results of this run to:
>
>  NPB Development Team
>  Internet: npb at nas.nasa.gov
>
> Hmm...
>
> srun --mpi=pmi2 --ntasks-per-node=8 --ntasks=16 ./hello_c > hello_c.out
>>
>> Hello, world, I am 24 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 0 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 25 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 1 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 27 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 2 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 29 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 31 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 30 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 4 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 5 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 17 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 3 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 7 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 6 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 18 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 22 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 23 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 19 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 9 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 20 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 8 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 10 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 13 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 11 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 26 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 16 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 14 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 28 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 21 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 15 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>> Hello, world, I am 12 of 32, (Open MPI v1.10.3, package: Open MPI
>> gwolosh at snode2.p-stheno.tartan.njit.edu Distribution, ident: 1.10.3,
>> repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 150)
>>
>>  srun --mpi=pmi2 --ntasks-per-node=8 --ntasks=16 --nodes=2 ./ring_c > ring_c.out
>>
>> Process 1 exiting
>> Process 12 exiting
>> Process 14 exiting
>> Process 13 exiting
>> Process 3 exiting
>> Process 11 exiting
>> Process 5 exiting
>> Process 6 exiting
>> Process 2 exiting
>> Process 4 exiting
>> Process 9 exiting
>> Process 10 exiting
>> Process 7 exiting
>> Process 15 exiting
>> Process 0 sending 10 to 1, tag 201 (16 processes in ring)
>> Process 0 sent to 1
>> Process 0 decremented value: 9
>> Process 0 decremented value: 8
>> Process 0 decremented value: 7
>> Process 0 decremented value: 6
>> Process 0 decremented value: 5
>> Process 0 decremented value: 4
>> Process 0 decremented value: 3
>> Process 0 decremented value: 2
>> Process 0 decremented value: 1
>> Process 0 decremented value: 0
>> Process 0 exiting
>> Process 8 exiting
>>
>>
>> 2017-12-07 10:05 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:
>>
>>>
>>>
>>> On Dec 7, 2017, at 12:51 PM, Artem Polyakov <artpol84 at gmail.com> wrote:
>>>
>>> Also, please post the output of
>>> $ srun --mpi=list
>>>
>>>
>>> [gwolosh at p-slogin bin]$ srun --mpi=list
>>> srun: MPI types are...
>>> srun: mpi/mpich1_shmem
>>> srun: mpi/mpich1_p4
>>> srun: mpi/lam
>>> srun: mpi/openmpi
>>> srun: mpi/none
>>> srun: mpi/mvapich
>>> srun: mpi/mpichmx
>>> srun: mpi/pmi2
>>> srun: mpi/mpichgm
>>>
>>>
>>>
>>> When the job crashes, are there any error messages in the relevant
>>> slurmd logs or in the output on the screen?
>>>
>>>
>>> on screen —
>>>
>>> [snode4][[274,1],24][connect/btl_openib_connect_udcm.c:
>>> 1448:udcm_wait_for_send_completion] send failed with verbs status 2
>>> [snode4:5175] *** An error occurred in MPI_Bcast
>>> [snode4:5175] *** reported by process [17956865,24]
>>> [snode4:5175] *** on communicator MPI_COMM_WORLD
>>> [snode4:5175] *** MPI_ERR_OTHER: known error not in list
>>> [snode4:5175] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
>>> will now abort,
>>> [snode4:5175] ***    and potentially your MPI job)
>>> mlx4: local QP operation err (QPN 0005f3, WQE index 40000, vendor
>>> syndrome 6c, opcode = 5e)
>>> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
>>> [snode4][[274,1],31][connect/btl_openib_connect_udcm.c:
>>> 1448:udcm_wait_for_send_completion] send failed with verbs status 2
>>> slurmstepd: error: *** STEP 274.0 ON snode1 CANCELLED AT
>>> 2017-12-07T12:55:46 ***
>>> [snode4:5182] *** An error occurred in MPI_Bcast
>>> [snode4:5182] *** reported by process [17956865,31]
>>> [snode4:5182] *** on communicator MPI_COMM_WORLD
>>> [snode4:5182] *** MPI_ERR_OTHER: known error not in list
>>> [snode4:5182] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
>>> will now abort,
>>> [snode4:5182] ***    and potentially your MPI job)
>>> mlx4: local QP operation err (QPN 0005f7, WQE index 40000, vendor
>>> syndrome 6c, opcode = 5e)
>>> [snode4][[274,1],27][connect/btl_openib_connect_udcm.c:
>>> 1448:udcm_wait_for_send_completion] send failed with verbs status 2
>>> [snode4:5178] *** An error occurred in MPI_Bcast
>>> [snode4:5178] *** reported by process [17956865,27]
>>> [snode4:5178] *** on communicator MPI_COMM_WORLD
>>> [snode4:5178] *** MPI_ERR_OTHER: known error not in list
>>> [snode4:5178] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
>>> will now abort,
>>> [snode4:5178] ***    and potentially your MPI job)
>>> mlx4: local QP operation err (QPN 0005fa, WQE index 40000, vendor
>>> syndrome 6c, opcode = 5e)
>>> srun: error: snode4: tasks 24,31: Exited with exit code 16
>>> srun: error: snode4: tasks 25-30: Killed
>>> srun: error: snode5: tasks 32-39: Killed
>>> srun: error: snode3: tasks 16-23: Killed
>>> srun: error: snode8: tasks 56-63: Killed
>>> srun: error: snode7: tasks 48-55: Killed
>>> srun: error: snode1: tasks 0-7: Killed
>>> srun: error: snode2: tasks 8-15: Killed
>>> srun: error: snode6: tasks 40-47: Killed
>>>
>>> Nothing striking in the slurmd log
>>>
>>>
>>>
>>> 2017-12-07 9:49 GMT-08:00 Artem Polyakov <artpol84 at gmail.com>:
>>>
>>>> Hello,
>>>>
>>>> What is the value of the MpiDefault option in your Slurm configuration file?
>>>>
>>>> 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh <gwolosh at njit.edu>:
>>>>
>>>>> Hello
>>>>>
>>>>> This is using Slurm version 17.02.6 running on Scientific Linux
>>>>> release 7.4 (Nitrogen).
>>>>>
>>>>> [gwolosh at p-slogin bin]$ module li
>>>>>
>>>>> Currently Loaded Modules:
>>>>>   1) GCCcore/.5.4.0 (H)   2) binutils/.2.26 (H)   3) GCC/5.4.0-2.26   4) numactl/2.0.11   5) hwloc/1.11.3   6) OpenMPI/1.10.3
>>>>>
>>>>> If I run
>>>>>
>>>>> srun --nodes=8 --ntasks-per-node=8 --ntasks=64  ./ep.C.64
>>>>>
>>>>> It runs successfully, but I get a message:
>>>>>
>>>>> PMI2 initialized but returned bad values for size/rank/jobid.
>>>>> This is symptomatic of either a failure to use the
>>>>> "--mpi=pmi2" flag in SLURM, or a borked PMI2 installation.
>>>>> If running under SLURM, try adding "-mpi=pmi2" to your
>>>>> srun command line. If that doesn't work, or if you are
>>>>> not running under SLURM, try removing or renaming the
>>>>> pmi2.h header file so PMI2 support will not automatically
>>>>> be built, reconfigure and build OMPI, and then try again
>>>>> with only PMI1 support enabled.
>>>>>
>>>>> If I run
>>>>>
>>>>> srun --nodes=8 --ntasks-per-node=8 --ntasks=64 --mpi=pmi2 ./ep.C.64
>>>>>
>>>>> The job crashes
>>>>>
>>>>> If I run via sbatch —
>>>>>
>>>>> #!/bin/bash
>>>>> # Job name:
>>>>> #SBATCH --job-name=nas_bench
>>>>> #SBATCH --nodes=8
>>>>> #SBATCH --ntasks=64
>>>>> #SBATCH --ntasks-per-node=8
>>>>> #SBATCH --time=48:00:00
>>>>> #SBATCH --output=nas.out.1
>>>>> #
>>>>> ## Command(s) to run (example):
>>>>> module use $HOME/easybuild/modules/all/Core
>>>>> module load GCC/5.4.0-2.26 OpenMPI/1.10.3
>>>>> mpirun -np 64  ./ep.C.64
>>>>>
>>>>> the job crashes
>>>>>
>>>>> Using easybuild, these are my config options for ompi —
>>>>>
>>>>> configopts = '--with-threads=posix --enable-shared --enable-mpi-thread-multiple --with-verbs '
>>>>> configopts += '--enable-mpirun-prefix-by-default '  # suppress failure modes in relation to mpirun path
>>>>> configopts += '--with-hwloc=$EBROOTHWLOC '  # hwloc support
>>>>> configopts += '--disable-dlopen '  # statically link component, don't do dynamic loading
>>>>> configopts += '--with-slurm --with-pmi '
>>>>>
>>>>> And finally —
>>>>>
>>>>> $ ldd /opt/local/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/bin/orterun | grep pmi
>>>>>         libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00007f0129d6d000)
>>>>>         libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x00007f0129b51000)
>>>>>
>>>>> $ ompi_info | grep pmi
>>>>>                   MCA db: pmi (MCA v2.0.0, API v1.0.0, Component v1.10.3)
>>>>>                  MCA ess: pmi (MCA v2.0.0, API v3.0.0, Component v1.10.3)
>>>>>              MCA grpcomm: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
>>>>>               MCA pubsub: pmi (MCA v2.0.0, API v2.0.0, Component v1.10.3)
>>>>>
>>>>>
>>>>> Any suggestions?
>>>>> _______________
>>>>> Gedaliah Wolosh
>>>>> IST Academic and Research Computing Systems (ARCS)
>>>>> NJIT
>>>>> GITC 2203
>>>>> 973 596 5437
>>>>> gwolosh at njit.edu
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Sincerely, Artem Yuryevich Polyakov
>>>> Best regards, Artem Y. Polyakov
>>>>
>>>
>>>
>>>
>>> --
>>> Sincerely, Artem Yuryevich Polyakov
>>> Best regards, Artem Y. Polyakov
>>>
>>>
>>>
>>
>>
>> --
>> Sincerely, Artem Yuryevich Polyakov
>> Best regards, Artem Y. Polyakov
>>
>> --
> ----- Best regards, Artem Polyakov (Mobile mail)
>
>
>


-- 
Sincerely, Artem Yuryevich Polyakov
Best regards, Artem Y. Polyakov