[slurm-users] Slurm - UnkillableStepProgram
Yap, Mike
M.Yap at massey.ac.nz
Tue Mar 23 02:12:36 UTC 2021
Hi All
I have been reading the archive, hoping to set up UnkillableStepTimeout and UnkillableStepProgram on our Slurm cluster,
but I'm somewhat confused about how they are applied.
1. I presume UnkillableStepTimeout is set in slurm.conf, and that it acts as a timer that triggers UnkillableStepProgram.
2. UnkillableStepProgram can be used to send an email or reboot the compute node. The question is: how do we configure it?
Current settings:
scontrol show config | grep -i kill
KillOnBadExit = 1
KillWait = 30 sec
UnkillableStepProgram = (null)
UnkillableStepTimeout = 300 sec
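To make the question concrete, here is a minimal sketch of what I imagine the setup would look like. It is untested. The script path, the log file location, and the UNKILLABLE_LOG override are all hypothetical, and I am assuming that Slurm runs the program on the affected compute node with SLURM_JOB_ID and SLURM_STEP_ID set in its environment. The email and drain actions are left commented out.

```shell
#!/bin/sh
# Hypothetical handler, e.g. saved as /usr/local/sbin/unkillable_step.sh
# (the path is an assumption) and referenced from slurm.conf as:
#   UnkillableStepProgram=/usr/local/sbin/unkillable_step.sh
#   UnkillableStepTimeout=300

# Log file location is an assumption; override it with UNKILLABLE_LOG.
LOG_FILE="${UNKILLABLE_LOG:-/tmp/unkillable_step.log}"

# Append one line describing the unkillable step.
# SLURM_JOB_ID and SLURM_STEP_ID are assumed to be provided by Slurm;
# fall back to "unknown" if they are not set.
report_unkillable() {
    job="${SLURM_JOB_ID:-unknown}"
    step="${SLURM_STEP_ID:-unknown}"
    node="$(uname -n)"
    printf '%s unkillable step: node=%s job=%s step=%s\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$node" "$job" "$step" >> "$LOG_FILE"
}

report_unkillable

# Possible follow-up actions, deliberately left commented out:
#   mail -s "Unkillable step on $(uname -n)" root < "$LOG_FILE"
#   scontrol update NodeName="$(uname -n)" State=DRAIN Reason="unkillable step"
```

If that is roughly right, I assume the slurm.conf change has to be rolled out to the compute nodes and the daemons reconfigured or restarted before it takes effect.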
Please advise
Thanks
Mike