[slurm-users] Larger jobs tend to get starved out on our cluster
Chris Samuel
chris at csamuel.org
Wed Jan 16 22:30:39 MST 2019
On 16/1/19 9:04 am, Baker D.J. wrote:
> Thank you for your reply regarding OpenMPI and srun. When I try to run
> an MPI program using srun I find the following:
>
>
> red[036-037]
> [red036.cluster.local:308110] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308107] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308111] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308101] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308105] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
So this looks like it's trying to use PMI1.
What do the following say?
srun --mpi=list
scontrol show config | fgrep -i mpidefault
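For comparison, on a PMIx-enabled build those typically report something
along these lines (illustrative only - the plugin names depend on your
Slurm version and how it was built, and ./my_mpi_program is just a
placeholder):

  $ srun --mpi=list
  srun: MPI types are...
  srun: none
  srun: pmi2
  srun: pmix

  $ scontrol show config | fgrep -i mpidefault
  MpiDefault              = none

If pmix shows up in that list but MpiDefault is none (or pmi2), you can
request it explicitly for a job:

  srun --mpi=pmix -n 4 ./my_mpi_program

or set MpiDefault=pmix in slurm.conf so it's used cluster-wide.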
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA