[slurm-users] job restart :: how to find the reason

Adrian Sevcenco Adrian.Sevcenco at spacescience.ro
Wed Dec 2 11:27:05 UTC 2020


Hi! I ran into a situation where a bunch of jobs were restarted,
as seen from these job fields: Requeue=1 Restarts=1 BatchFlag=1 Reboot=0 ExitCode=0:0
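
For reference, here is a minimal sketch of how the affected jobs could be picked out by parsing that line. It assumes the text is the Key=Value output of something like `scontrol show job`, and the helper names (parse_job_fields, was_restarted) are made up for illustration. It only detects that a restart happened; it does not tell why.

```python
# Sample line copied from the message above.
SAMPLE = "Requeue=1 Restarts=1 BatchFlag=1 Reboot=0 ExitCode=0:0"


def parse_job_fields(text):
    """Split whitespace-separated Key=Value tokens into a dict of strings."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that carry no '='
            fields[key] = value
    return fields


def was_restarted(fields):
    """Return True when the Restarts counter is above zero (hypothetical helper)."""
    return int(fields.get("Restarts", "0")) > 0


if __name__ == "__main__":
    job = parse_job_fields(SAMPLE)
    print(job["Requeue"], job["Restarts"], was_restarted(job))
```

Requeue=1 here is assumed to mean only that the job is allowed to be requeued, so the Restarts counter is the field checked.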

So I would like to know how I can find out why there was a Requeue
(when there is only one partition defined) and why there was a restart.

Thanks a lot!!!
Adrian


