[slurm-users] Larger jobs tend to get starved out on our cluster

Chris Samuel chris at csamuel.org
Wed Jan 16 22:30:39 MST 2019


On 16/1/19 9:04 am, Baker D.J. wrote:

> Thank you for your reply regarding OpenMPI and srun. When I try to run
> an MPI program using srun I find the following...
> 
> 
> red[036-037]
> [red036.cluster.local:308110] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308107] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308111] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308101] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized
> [red036.cluster.local:308105] PMI_Init [pmix_s1.c:168:s1_init]: PMI is not initialized

So this looks like it's trying to use PMI-1 (that's the pmix_s1 / s1_init component showing up in those errors).

What do the following say?

srun --mpi=list
scontrol show config | fgrep -i mpidefault
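
If a pmix plugin shows up in that list then one thing you could try (I'm assuming here that both Slurm and OpenMPI were built against the same PMIx, which may not be the case on your system) is asking srun for it explicitly, something like:

srun --mpi=pmix -N 2 -n 8 ./your_mpi_program

(./your_mpi_program is just a stand-in for whatever binary you're launching.)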

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA


