[slurm-users] Problems calling mpirun in OpenMPI-3.1.6 + slurm and OpenMPI-4.0.3+slurm environments

Jeffrey T Frey frey at udel.edu
Fri Apr 10 16:59:04 UTC 2020


Are you certain your PATH addition is correct?  The "-np" flag is still present in a build of Open MPI 4.0.3 I just made:


$ 4.0.3/bin/mpirun 
--------------------------------------------------------------------------
mpirun could not find anything to do.

It is possible that you forgot to specify how many processes to run
via the "-np" argument.
--------------------------------------------------------------------------


Note that with the Slurm plugins present in your Open MPI build, there should be no need to use "-np" on the command line; the Slurm RAS plugin should pull that information from the Slurm runtime environment variables.  If you do use "-np" to request more CPUs than the job was allocated, you'll receive oversubscription errors (unless you include mpirun flags that allow oversubscription).
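
As a rough sketch of what that could look like, here is the job script with "-np" dropped so that mpirun sizes itself from the allocation. The "--ntasks=4" directive is my assumption (your script only set --ntasks-per-node=1), and describe_allocation is a hypothetical helper that just echoes two of the variables Slurm exports to the job; the guard around mpirun is there only so the script fails legibly if the PATH is wrong.

```shell
#!/bin/bash
#SBATCH --job-name=ompi4
#SBATCH --ntasks=4              # assumption: the 4 ranks that "-np 4" asked for
#SBATCH --time=00:30:00

export PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin:$PATH
export LD_LIBRARY_PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/lib:${LD_LIBRARY_PATH:-}

# Hypothetical helper: print the allocation details Slurm exported to the job.
# Outside of a Slurm job both values simply show up as "unset".
describe_allocation() {
    echo "ntasks=${SLURM_NTASKS:-unset} nodelist=${SLURM_JOB_NODELIST:-unset}"
}
describe_allocation

# No -np: mpirun is expected to take the process count from the Slurm allocation.
if command -v mpirun >/dev/null 2>&1; then
    mpirun ./bcast_timing -t 1 || echo "mpirun exited with status $?" >&2
else
    echo "mpirun not found on PATH" >&2
fi
```

If the describe_allocation line in ompi4.out does not show the task count you expected, the problem is in the #SBATCH directives rather than in Open MPI.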


What if you add "which mpirun" to your job script ahead of the "mpirun" command -- does it show you /scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin/mpirun?
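
Something along these lines near the top of the job script would make the check explicit. This is only a sketch: check_mpirun is a hypothetical helper name, and the expected directory is copied from the PATH line in your script (note that the configure line you quoted shows a /home prefix rather than /scratch, so adjust it to wherever the build really lives).

```shell
#!/bin/bash
# Directory taken from the PATH export in the job script (an assumption about
# where the 4.0.3 build is installed; the configure prefix shown was under /home).
expected_bindir=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin

# Hypothetical helper: report which mpirun the shell resolves and whether it
# sits in the directory given as the first argument.
check_mpirun() {
    local want="$1" found
    found=$(command -v mpirun) || { echo "mpirun: not found on PATH"; return 1; }
    echo "mpirun resolves to: $found"
    [ "$(dirname "$found")" = "$want" ] || { echo "expected: $want/mpirun"; return 1; }
}

check_mpirun "$expected_bindir" || echo "warning: the job is not picking up the expected mpirun" >&2
```

If the resolved path is some other mpirun (for instance one from a system package or a different MPI implementation), that would explain an "unknown option" error for a flag that Open MPI itself accepts.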




> On Apr 10, 2020, at 12:12 , Ravi Reddy Manumachu <ravi.manumachu at ucd.ie> wrote:
> 
> 
> Dear Slurm Users,
> 
> I am facing issues with the following combinations of OpenMPI and SLURM. I was wondering if you have faced something similar and can help me.
> 
> OpenMPI-3.1.6 and slurm 19.05.5
> OpenMPI-4.0.3 and slurm 19.05.5
> 
> I have the OpenMPI packages configured with "--with-slurm" option and installed. 
> 
>   Configure command line: '--prefix=/home/manumachu/openmpi-4.0.3/OPENMPI_INSTALL' '--with-slurm'
>                  MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.0.3)
>                  MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
>                  MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.0.3)
>               MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.0.3)
> 
> I am executing the sbatch script shown below:
> 
> #!/bin/bash
> #SBATCH --account=xxxxx
> #SBATCH --job-name=ompi4
> #SBATCH --output=ompi4.out
> #SBATCH --error=ompi4.err
> #SBATCH --ntasks-per-node=1
> #SBATCH --time=00:30:00
> export PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/bin:$PATH
> export LD_LIBRARY_PATH=/scratch/manumachu/openmpi-4.0.3/OPENMPI_INSTALL/lib:$LD_LIBRARY_PATH
> mpirun -np 4 ./bcast_timing -t 1
> 
> No matter what option I give to mpirun, I get the following error:
> mpirun: Error: unknown option "-np"
> 
> I have used mpiexec also but received the same errors.
> 
> To summarize, I am not able to call mpirun from a SLURM script. I can use srun but I have no idea how to pass MCA parameters I usually give to mpirun such as, "--map-by ppr:1:socket -mca pml ob1 -mca btl tcp,self -mca coll_tuned_use_dynamic_rules 1".
> 
> Thank you for your help.
> 
> -- 
> Kind Regards
> Dr. Ravi Reddy Manumachu
> Research Fellow, School of Computer Science, University College Dublin
> Ravi Manumachu on Google Scholar, ResearchGate
