[slurm-users] How to trigger kernel stacktraces for stuck processes from unkillable steps

Christopher Samuel chris at csamuel.org
Wed Sep 18 19:52:34 UTC 2019


Hi all,

At the Slurm User Group I mentioned how to tell the kernel to dump 
information about stuck processes from your unkillable step script to 
the kernel log buffer (seen via dmesg and hopefully syslog'd somewhere 
useful for you).

echo w > /proc/sysrq-trigger

That's it.. ;-)  You probably want to echo something useful to /dev/kmsg 
beforehand to say what the job ID was that triggered it too.

The 'echo' will block until the kernel completes the writes, which if 
you've got a lot of stuck processes may take a few seconds.
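
For anyone wiring this into the script they point UnkillableStepProgram 
at, here is a minimal sketch of the two steps above as a shell function. 
The function name dump_blocked_tasks and its two path arguments are 
made up for illustration (the arguments only exist so it can be tried 
against scratch files), and it assumes SLURM_JOB_ID is set in the 
script's environment, falling back to "unknown" if it is not.

```shell
#!/bin/bash
# Sketch of a helper for an unkillable step script (needs root for the real paths).
# Assumption: SLURM_JOB_ID is present in the environment; "unknown" is used otherwise.

dump_blocked_tasks() {
    # Hypothetical parameters: default to the real kernel interfaces,
    # but can be pointed at ordinary files for a dry run.
    local kmsg="${1:-/dev/kmsg}"
    local trigger="${2:-/proc/sysrq-trigger}"
    local job_id="${SLURM_JOB_ID:-unknown}"

    # Tag the kernel log first so the traces that follow can be tied to a job.
    echo "unkillable step: job ${job_id} on $(uname -n), dumping blocked tasks" > "$kmsg"

    # 'w' asks the kernel to dump tasks in uninterruptible (blocked) state.
    # This write blocks until the kernel has finished logging.
    echo w > "$trigger"
}
```

Calling dump_blocked_tasks with no arguments from the real script writes 
to /dev/kmsg and /proc/sysrq-trigger; passing two scratch file names 
lets you check the message format without touching the kernel.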

Hope this is useful!

All the best,
Chris
-- 
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA


