[slurm-users] jobs not running with srun
Prentice Bisbal
pbisbal at pppl.gov
Fri Apr 24 18:33:01 UTC 2020
We are in the process of upgrading to CentOS 7, and have built Slurm
19.05.5 and OpenMPI 4.0.3 for CentOS 7. When I submit a job that launches
its program using srun, the job appears to be running according to squeue
(state = R), but the program doesn't do anything. I'm testing with a simple
Hello, World program that I've been using for years.
This same program runs just fine when I launch it with mpiexec or mpirun
instead of srun. Any ideas of what's wrong?
I was originally getting a different failure, but over on the OpenMPI
list I was told that Slurm wasn't compiled with PMIx support. We rebuilt
Slurm with PMIx support, and that's when the jobs started "running" and
then just hanging. Any ideas about what's wrong or how to debug this?
Thanks,
Prentice