[slurm-users] Jobs cancelled due to job requeue

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Sat Sep 3 07:59:13 UTC 2022


On 02-09-2022 20:52, Nicolas Sonoda wrote:
> I'm submitting a job, but after a few seconds it gets cancelled and the
> Slurm output file shows this message:
> 
> slurmstepd: error: *** JOB 23883 ON gn01 CANCELLED AT 
> 2022-09-02T14:28:19 DUE TO JOB REQUEUE ***
> 
> After this the job goes into the PD state in the queue, with the reason BeginTime:
> 
> JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
> 23884       gpu Memb.LS1     vhpc PD       0:00      1 (BeginTime)
> 
> And after a while the job stays in the RH state with the reason JobHoldMaxRequeue.
> 
> I'm attaching my script and input files.
> 
> Can you help me with that?

You could look in the slurmctld.log file and the node's slurmd.log file 
to see what they say about the job.

Check your slurm.conf requeue configuration:

$ scontrol show config | grep Requeue
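To see how often the job itself has been requeued, it may also help to look at the job record. The following is only a sketch: requeue_fields is a hypothetical helper name, not a Slurm command, and the log path in the comments is an assumption (the real location is whatever SlurmctldLogFile is set to on your system). It just filters the output of "scontrol show job" down to the state, reason and requeue/restart counters:

```shell
# Hypothetical helper (not part of Slurm): reads `scontrol show job <jobid>`
# output on stdin and prints only the fields relevant to requeueing.
requeue_fields() {
    # Put each Key=Value pair on its own line, then keep the ones of interest.
    tr -s ' \t' '\n' | grep -E '^(JobId|JobState|Reason|Requeue|Restarts)='
}

# Usage on the cluster (shown as comments, nothing is executed here):
#   scontrol show job 23884 | requeue_fields
#   grep 23883 /var/log/slurm/slurmctld.log   # path is an assumption
#   scontrol release 23884                    # only after the cause is fixed
```

A Restarts count that keeps increasing would suggest the job fails to start on the node each time, which is what the slurmd.log on gn01 should explain.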

/Ole


