[slurm-users] Jobs stuck with BeginTime and prolog exit status 99:0
Chandler
admin at genome.arizona.edu
Tue May 17 16:27:11 UTC 2022
Could you help me figure out why our jobs are stuck in the pending (PD) state with reason BeginTime? For example:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
24458 defq cromwell smrtanal PD 0:00 1 (BeginTime)
# scontrol show job 24458
JobId=24458 JobName=cromwell_d72d675a_dataset_filter
UserId=smrtanalysis(1002) GroupId=smrtanalysis(1002) MCS_label=N/A
Priority=4294892709 Nice=0 Account=(null) QOS=normal
JobState=PENDING Reason=BeginTime Dependency=(null)
Requeue=1 Restarts=784 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2022-05-17T09:23:03 EligibleTime=2022-05-17T09:25:04
AccrueTime=2022-05-17T09:25:04
StartTime=2022-05-17T09:25:04 EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-05-17T09:23:03
Partition=defq AllocNode:Sid=EagI:2725352
ReqNodeList=(null) ExcNodeList=(null)
NodeList=(null)
BatchHost=EagI
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=(null)
WorkDir=/data2/pacbio/smrtlink/jobs
StdErr=/data2/pacbio/smrtlink/jobs/cromwell-executions/pb_export_ccs/441e90d6-263b-41a5-bbbb-5009d9a346d9/call-prepare_input/prepare_input/d72d675a-4df7-4e9f-8072-e722742f48e7/call-dataset_filter/execution/stderr
StdIn=/dev/null
StdOut=/data2/pacbio/smrtlink/jobs/cromwell-executions/pb_export_ccs/441e90d6-263b-41a5-bbbb-5009d9a346d9/call-prepare_input/prepare_input/d72d675a-4df7-4e9f-8072-e722742f48e7/call-dataset_filter/execution/stdout
Power=
#
/var/log/slurmctld:
[2022-05-17T09:20:44.366] Requeuing JobId=24458
[2022-05-17T09:23:03.068] backfill: Started JobId=24458 in defq on EagI
[2022-05-17T09:23:03.106] error: prolog_slurmctld JobId=24458 prolog exit status 99:0
[2022-05-17T09:23:03.114] Requeuing JobId=24458
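The excerpt above appears to show a loop: the job is started, the PrologSlurmctld script exits with status 99:0, and the job is requeued, which would be consistent with Restarts=784 in the scontrol output. Below is a minimal sketch for tallying such failures per job from a slurmctld log. It assumes the log lines have exactly the format shown above; the function name `prolog_failures` and the embedded sample list are hypothetical and only for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [2022-05-17T09:23:03.106] error: prolog_slurmctld JobId=24458 prolog exit status 99:0
# (format assumed from the log excerpt in this post)
PROLOG_FAIL = re.compile(
    r"^\[(?P<ts>[^\]]+)\] error: prolog_slurmctld JobId=(?P<job>\d+) "
    r"prolog exit status (?P<status>\d+:\d+)"
)


def prolog_failures(lines):
    """Count prolog_slurmctld failures per (job ID, exit status) pair."""
    counts = Counter()
    for line in lines:
        m = PROLOG_FAIL.match(line.strip())
        if m:
            counts[(m.group("job"), m.group("status"))] += 1
    return counts


# The four log lines quoted above, used as sample input.
sample = [
    "[2022-05-17T09:20:44.366] Requeuing JobId=24458",
    "[2022-05-17T09:23:03.068] backfill: Started JobId=24458 in defq on EagI",
    "[2022-05-17T09:23:03.106] error: prolog_slurmctld JobId=24458 prolog exit status 99:0",
    "[2022-05-17T09:23:03.114] Requeuing JobId=24458",
]

result = prolog_failures(sample)
for (job, status), n in result.items():
    print(f"JobId={job} prolog exit status {status}: {n} failure(s)")
```

To run it against the real log, the sample list could be replaced with an open file handle, for example `prolog_failures(open("/var/log/slurmctld"))`. If every start attempt is paired with a prolog failure, the script configured as PrologSlurmctld would be the first thing to inspect.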
Thanks
--
Chandler Sobel-Sorenson (he/him) / Systems Administrator
Arizona Genomics Institute
www.genome.arizona.edu