[slurm-users] job not running because of "Resources", but resources are available
Bernstein, Noam CIV USN NRL (6393) Washington DC (USA)
noam.bernstein at nrl.navy.mil
Sat Mar 20 00:12:40 UTC 2021
Can anyone explain why job 1908239 is not running, or what else I can check? squeue gives the reason as "Resources", and the estimated start time is always the current time, no matter when I run "squeue --start", but nodes are available according to "sinfo ... state=idle". It's only a 1-minute job, so the cause can't be that the nodes won't be free long enough for it to be backfilled.
The Slurm version is admittedly a bit old: 19.05.7.
> squeue -p n2019 --state=PD -l
Fri Mar 19 20:09:17 2021
JOBID   PARTITION NAME     USER     STATE   TIME TIME_LIMI   NODES NODELIST(REASON)
1908239 n2019     LiCu_SPA bernstei PENDING 0:00 1:00        1     (Resources)
1908236 n2019     cspbbr3- jllyons  PENDING 0:00 2-16:00:00  2     (Priority)
1908227 n2019     Cy3_dupl yckim    PENDING 0:00 33-08:00:00 4     (Priority)
1908231 n2019,n20 sGC_Fe_N bernstei PENDING 0:00 7-00:00:00  4     (JobHeldUser)
1908238 n2019     LiCu_SPA bernstei PENDING 0:00 1:00:00     1     (JobHeldUser)
> squeue -j 1908239 --start
JOBID   PARTITION NAME     USER     ST START_TIME          NODES SCHEDNODES        NODELIST(REASON)
1908239 n2019     LiCu_SPA bernstei PD 2021-03-19T20:09:17 1     compute-4-[18-19] (Resources)
> sinfo -p n2019 state=idle
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
n2019     up    infinite  43    alloc compute-4-[0-11,13-17,20-26,28-39,41-47]
n2019     up    infinite  5     idle  compute-4-[12,18-19,27,40]
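
A few further checks, sketched under assumptions: the sinfo output above still lists the "alloc" rows, so the trailing "state=idle" argument does not seem to have filtered anything; sinfo's own state filter is --states (or -t). The helper rows_in_state below is hypothetical (it is not a Slurm command) and assumes sinfo's default column layout, with STATE in the fifth field. The job ID, partition, and node names are taken from the output above. Comparing what the job requests (CPUs, memory, features, GRES) with what the idle nodes offer may show why "Resources" is reported.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): keep the header line plus any
# sinfo rows whose STATE column (5th field in the default format) equals
# the given state exactly. States with a suffix (e.g. "idle*") won't match.
rows_in_state() {
    awk -v want="$1" 'NR == 1 || $5 == want'
}

job=1908239   # job ID from the post
part=n2019    # partition from the post

# Only call the Slurm tools where they are actually installed.
if command -v scontrol >/dev/null 2>&1; then
    # Full job request: CPUs, memory, features, GRES, and so on.
    scontrol show job "$job"

    # Per-node details for the nodes squeue listed under SCHEDNODES.
    for node in compute-4-18 compute-4-19; do
        scontrol show node "$node"
    done

    # Idle nodes only, using sinfo's own state filter.
    sinfo -p "$part" --states=idle

    # The same selection applied to unfiltered output via the helper.
    sinfo -p "$part" | rows_in_state idle
fi
```

With the unfiltered sinfo output shown above as input, rows_in_state idle keeps only the header and the single "idle" row.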