[slurm-users] jobs stuck in ReqNodeNotAvail,
Merlin Hartley
merlin-slurm at mrc-mbu.cam.ac.uk
Wed Nov 29 09:08:19 MST 2017
Can you give us the output of
# scontrol show job 6982
Could be an issue with requesting too many CPUs or something…
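
If the full output is long, the fields that matter for a pending job can be filtered out first. This is only a rough sketch, assuming a POSIX shell with awk on the submit host. `job_fields` is a made-up helper name, not a Slurm command, and the key list is just a selection of the key=value tokens that `scontrol show job` prints.

```shell
#!/bin/sh
# job_fields (hypothetical helper): read `scontrol show job` output on stdin
# and print only the key=value tokens relevant to why a job is pending.
job_fields() {
    awk '{
        # scontrol prints whitespace-separated KEY=VALUE tokens on each line
        for (i = 1; i <= NF; i++)
            if ($i ~ /^(JobState|Reason|NumNodes|NumCPUs|MinCPUsNode|MinMemoryNode|MinMemoryCPU|TRES)=/)
                print $i
    }'
}

# Intended use on a host with the Slurm client tools installed:
#   scontrol show job 6982 | job_fields
```

Comparing the CPU and memory request shown there with what `scontrol show node` reports as allocated on the nodes should show whether the request can fit at all.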
Merlin
--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom
> On 29 Nov 2017, at 15:21, Christian Anthon <anthon at rth.dk> wrote:
>
> Hi,
>
> I have a problem with a newly set up slurm-17.02.7-1.el6.x86_64: jobs seem to be stuck in ReqNodeNotAvail:
>
> 6982 panic Morgens ferro PD 0:00 1 (ReqNodeNotAvail, UnavailableNodes:)
> 6981 panic SPEC ferro PD 0:00 1 (ReqNodeNotAvail, UnavailableNodes:)
>
> The nodes are fully allocated in terms of memory, but not all CPU resources are consumed:
>
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> _default up infinite 19 mix clone[05-11,25-29,31-32,36-37,39-40,45]
> _default up infinite 11 alloc alone[02-08,10-13]
> fastlane up infinite 19 mix clone[05-11,25-29,31-32,36-37,39-40,45]
> fastlane up infinite 11 alloc alone[02-08,10-13]
> panic up infinite 19 mix clone[05-11,25-29,31-32,36-37,39-40,45]
> panic up infinite 12 alloc alone[02-08,10-13,15]
> free* up infinite 19 mix clone[05-11,25-29,31-32,36-37,39-40,45]
> free* up infinite 11 alloc alone[02-08,10-13]
>
> Possibly relevant lines in slurm.conf (full slurm.conf attached)
>
> SchedulerType=sched/backfill
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_Memory
> TaskPlugin=task/none
> FastSchedule=1
>
> Any advice?
>
> Cheers, Christian.
>
> <slurm.conf>