Hello,

As an example, my UnkillableStepProgram is just a bash script that collects recent logs and running processes and mails me about the error. Nothing special.
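
A minimal sketch of what such a script could look like is below. It is not the actual script: the recipient, the log path, and the `ADMIN_EMAIL`, `SLURMD_LOG` and `MAILER` variable names are assumptions for illustration. It also assumes slurmd passes `SLURM_JOB_ID` and `SLURM_STEP_ID` in the environment, and prints "unknown" if they are unset.

```shell
#!/bin/sh
# Sketch of an UnkillableStepProgram. Recipient, log path and variable
# names below are assumptions for illustration, not Slurm defaults.

ADMIN_EMAIL="${ADMIN_EMAIL:-root}"                     # hypothetical recipient
SLURMD_LOG="${SLURMD_LOG:-/var/log/slurm/slurmd.log}"  # assumed; match SlurmdLogFile
MAILER="${MAILER:-mail}"                               # any mailx-compatible command

# Print a plain-text report: job/step, stuck processes, recent logs.
build_report() {
    echo "Unkillable step on $(uname -n)"
    echo "Job ${SLURM_JOB_ID:-unknown}, step ${SLURM_STEP_ID:-unknown}"
    echo
    echo "== Processes in uninterruptible sleep (state D) =="
    # Keep the header line plus every process whose state starts with D.
    ps -eo pid,ppid,user,stat,etime,args 2>/dev/null | awk 'NR == 1 || $4 ~ /^D/'
    echo
    echo "== Last lines of $SLURMD_LOG =="
    if [ -r "$SLURMD_LOG" ]; then
        tail -n 50 "$SLURMD_LOG"
    else
        echo "(log file not readable)"
    fi
    echo
    echo "== Recent kernel messages =="
    dmesg 2>/dev/null | tail -n 30
}

# Mail the report read from stdin, or print it if no mailer is installed.
send_report() {
    if command -v "$MAILER" >/dev/null 2>&1; then
        "$MAILER" -s "Unkillable step on $(uname -n)" "$ADMIN_EMAIL"
    else
        cat
    fi
}

build_report | send_report
```

To use something like this, make the script executable on the compute nodes and point `UnkillableStepProgram` in slurm.conf at it, for example `UnkillableStepProgram=/usr/local/sbin/unkillable_report.sh` (a hypothetical path).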

Best regards,
--
Lorenzo Bosio
Tecnico di Ricerca - Laboratorio HPC4AI
Dipartimento di Informatica
Università degli Studi di Torino
Corso Svizzera, 185 - 10149 Torino
tel. +39 340 216 8249
tel. +39 011 670 6836

On Thu, 18 Sep 2025 at 12:22, Gestió Servidors via slurm-users <slurm-users@lists.schedmd.com> wrote:

Hi,

 

After reading the answer from Ole Holm Nielsen, I have increased “MessageTimeout” to 20 s (the default is 10 s) and “UnkillableStepTimeout” to 150 s (the default is 60 s, and it should be at least 5 times larger than “MessageTimeout”). However, I have also read that “UnkillableStepProgram” specifies the program to run in those cases, and by default no program is assigned to that parameter, so nothing is run. So my question is whether anyone uses a custom “UnkillableStepProgram” and, if so, whether they could explain what it does.
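
The relationship between the two timeouts mentioned above can be checked with a small shell function. This is only a sketch: the function name `check_timeouts` is hypothetical, and it assumes each setting appears at the start of a line as `Key=Value` (slurm.conf style) or `Key = Value sec` (as printed by `scontrol show config`).

```shell
# Check that UnkillableStepTimeout is at least 5 times MessageTimeout.
# Reads configuration lines on stdin; exit status is 0 when the rule
# holds, 1 when it does not, and 2 when either value is missing.
check_timeouts() {
    awk -F'[ =]+' '
        tolower($1) == "messagetimeout"        { msg = $2 + 0 }
        tolower($1) == "unkillablesteptimeout" { unk = $2 + 0 }
        END {
            if (msg == 0 || unk == 0) { print "missing value"; exit 2 }
            if (unk >= 5 * msg) { printf "ok: %d >= 5 * %d\n", unk, msg; exit 0 }
            printf "too small: %d < 5 * %d\n", unk, msg
            exit 1
        }'
}
```

With the values above (20 s and 150 s) the check passes, since 5 × 20 = 100 ≤ 150. On a running cluster it could be fed from `scontrol show config | check_timeouts`.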

 

Thanks a lot!

 


--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-leave@lists.schedmd.com