[slurm-users] jobs not running with srun

Prentice Bisbal pbisbal at pppl.gov
Fri Apr 24 18:33:01 UTC 2020


We are in the process of upgrading to CentOS 7, and have built Slurm 
19.05.5 and OpenMPI 4.0.3 for CentOS 7. When I submit a job that 
launches its program with srun, squeue shows the job as running 
(state = R), but the program doesn't do anything. I'm testing with a 
simple Hello, World program that I've been using for years.

This same program runs just fine when I launch it with mpiexec or mpirun 
instead of srun. Any ideas of what's wrong?

I was originally getting a different failure. Over on the OpenMPI list 
I was told that Slurm wasn't compiled with PMIx support, so we rebuilt 
Slurm with PMIx support, and that's when the jobs started "running" and 
just hanging. Any ideas of what's wrong or how to debug this?
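
One check that seems relevant here is whether the rebuilt srun actually 
reports a PMIx plugin. The following is only a rough sketch, not 
something from our actual setup: it runs `srun --mpi=list` and looks for 
plugin names beginning with "pmix". The helper names (pmix_plugins, 
srun_mpi_listing) are made up for illustration, and matching on the 
"pmix" prefix is an assumption about how the plugins are named in the 
listing, not a documented output format.

```python
import re
import shutil
import subprocess
from typing import List


def pmix_plugins(listing: str) -> List[str]:
    """Return the distinct names starting with 'pmix' found in the text.

    Assumption: PMIx plugins show up in the listing under names that
    begin with 'pmix' (for example 'pmix' or a versioned variant).
    """
    seen: List[str] = []
    for name in re.findall(r"\bpmix\w*", listing):
        if name not in seen:
            seen.append(name)
    return seen


def srun_mpi_listing() -> str:
    """Run `srun --mpi=list` and return stdout and stderr combined.

    Both streams are captured because the listing may be written to
    either one depending on the Slurm version.
    """
    result = subprocess.run(
        ["srun", "--mpi=list"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    if shutil.which("srun") is None:
        print("srun not found on PATH")
    else:
        found = pmix_plugins(srun_mpi_listing())
        print("PMIx plugins reported by srun:", found or "none")
```

If a PMIx plugin is listed, it may also be worth launching with 
`srun --mpi=pmix` explicitly, in case MpiDefault in slurm.conf is set to 
something else.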

Thanks,

Prentice



