[slurm-users] Help with PMIx, Slurm, Intel MPI

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] matthew.thompson at nasa.gov
Fri Jan 18 17:29:01 UTC 2019


All,

This is probably a very basic question, but I need to ask. The cluster I use recently installed UCX and PMIx, which is nice. I'm currently trying to build a stack of Open MPI 4.0.0 that can use them, but until then I thought I'd try Intel MPI based on https://slurm.schedmd.com/mpi_guide.html#intel_mpi

First, Slurm does seem to see PMIx:

(1041)(master) $ srun --version
srun: cluster configuration lacks support for cpu binding
slurm 17.11.12
(1042)(master) $ srun --mpi=list
srun: cluster configuration lacks support for cpu binding
srun: MPI types are...
srun: pmi2
srun: none
srun: openmpi
srun: pmix
srun: pmix_v2
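
As a minimal sketch, the check above could be scripted so a job script can test for a given plugin before launching. The helper name has_mpi_plugin is hypothetical (it is not part of Slurm), and it only assumes the "srun: <name>" line format shown in the output above:

```shell
# Hypothetical helper (not part of Slurm): succeeds if the plugin named
# in $1 appears in `srun --mpi=list` style output read from stdin.
has_mpi_plugin() {
    # Plugin lines look like "srun: pmi2". Strip the "srun: " prefix,
    # then require an exact, whole-line, fixed-string match so that
    # "pmix" does not also match "pmix_v2".
    sed -n 's/^srun: //p' | grep -qxF -- "$1"
}

# Example use (not run here; needs a Slurm installation):
#   srun --mpi=list 2>&1 | has_mpi_plugin pmix && echo "pmix plugin listed"
```

The `2>&1` in the usage comment is there in case srun prints the list on stderr rather than stdout. This only confirms that a plugin is listed; it says nothing about whether a particular MPI library will launch correctly with it.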

And I can run fine with mpirun (I've already salloc'd some nodes), which is how I always run with Intel MPI:

(1051)(master) $ mpirun -np 4 ./helloWorld.mpi3.SLES12.IMPI.exe
Compiler Version: Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.1.144 Build 20181018
MPI Version: 3.1
MPI Library Version: Intel(R) MPI Library 2019 Update 1 for Linux* OS

Process    0 of    4 is on borgc129
Process    1 of    4 is on borgc129
Process    2 of    4 is on borgc129
Process    3 of    4 is on borgc129

But I have issues when I try to use Intel MPI with srun. It just halts for a minute or so with:

(1059)(master) $ env I_MPI_PMI_LIBRARY=/usr/nlocal/pmix/2.1/lib64/libpmi2.so srun -n 4 ./helloWorld.mpi3.SLES12.IMPI.exe
srun: cluster configuration lacks support for cpu binding
srun: Warning: can't run 4 processes on 8 nodes, setting nnodes to 4

and then I see:

srun: Job 36007416 step creation temporarily disabled, retrying

So I'm doing something dumb, obviously, but do you know what?

Thanks,
Matt

-- 
Matt Thompson, SSAI, Sr Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson


