[slurm-users] How to trigger kernel stacktraces for stuck processes from unkillable steps
    Christopher Samuel 
    chris at csamuel.org
       
    Wed Sep 18 19:52:34 UTC 2019
    
    
  
Hi all,
At the Slurm User Group I mentioned how, from your unkillable step 
script, you can tell the kernel to dump information about stuck 
processes to the kernel log buffer (seen via dmesg and hopefully 
syslog'd somewhere useful for you).
echo w > /proc/sysrq-trigger
That's it. ;-)  You probably also want to echo something useful to 
/dev/kmsg beforehand to record which job ID triggered it.
The 'echo' will block until the kernel completes the writes, which may 
take a few seconds if you've got a lot of stuck processes.
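Putting those pieces together, here is a minimal sketch of what the 
relevant part of an unkillable step script might look like. The function 
name dump_blocked_tasks and the KMSG_PATH / SYSRQ_PATH overrides are made 
up for illustration (the overrides only exist so you can try it against 
scratch files without root), and I'm assuming the job ID is available to 
the script, e.g. via SLURM_JOB_ID, which you should check for your Slurm 
version.

```shell
#!/bin/sh
# Sketch only: tag the kernel log with the job ID, then ask the kernel
# to dump blocked (uninterruptible) tasks via sysrq 'w'.
# dump_blocked_tasks, KMSG_PATH and SYSRQ_PATH are hypothetical names.
dump_blocked_tasks() {
    job_id="${1:-unknown}"
    kmsg="${KMSG_PATH:-/dev/kmsg}"
    sysrq="${SYSRQ_PATH:-/proc/sysrq-trigger}"

    # Marker line first, so the stack traces that follow in dmesg can
    # be tied back to the job that triggered them.
    echo "unkillable step: dumping blocked tasks for Slurm job ${job_id}" > "$kmsg"

    # Needs root when writing to the real file; blocks until the kernel
    # has finished writing the traces.
    echo w > "$sysrq"
}
```

You'd then call it from the script with something like 
dump_blocked_tasks "${SLURM_JOB_ID:-unknown}" (again, assuming that 
variable is set in the script's environment).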
Hope this is useful!
All the best,
Chris
-- 
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
    
    