[slurm-users] How to trap a SIGINT signal in a child process of a batch ?
Jean-mathieu CHANTREIN
jean-mathieu.chantrein at univ-angers.fr
Tue Apr 21 07:42:51 UTC 2020
Hello,
I'm using Slurm version 19.05.2 on Debian 10.
I'm trying to handle a SIGINT signal in a child process of a batch job.
The signal is sent automatically 30 s before the job's time limit is reached.
You can see this mechanism in this minimal example:
---------------------------------------
test.slurm:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=1
#SBATCH --time=00:03:00
#SBATCH --signal=B:SIGINT@30
# This example works, but I need it to work without "B:" in the --signal option: test.sh, not test.slurm, should receive the SIGINT signal.
sig_handler()
{
echo "BATCH interrupted"
exit 2
}
trap 'sig_handler' SIGINT
/home/user/test.sh &
wait
---------------------------------------
test.sh:
#!/bin/bash
function sig_handler()
{
echo "Executable interrupted"
exit 2
}
trap 'sig_handler' SIGINT
echo "BEGIN"
sleep 200
echo "END"
---------------------------------------
Unfortunately, when I instead use the following in test.slurm:
#SBATCH --signal=SIGINT@30
the SIGINT signal does not seem to be received.
I tried to debug this with scancel, like this:
scancel --signal=SIGINT IDJOB
without success. This way, only SIGKILL signals are received, but a SIGKILL signal cannot be trapped.
In https://slurm.schedmd.com/scancel.html, the description of the -b option says the following, although it seems to apply even without -b:
By default, signals other than SIGKILL are not sent to the batch step
How can I change this default behavior?
Do you see the same behavior on your systems?
How would you get a SIGINT signal trapped in test.sh?
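As a side check that needs no Slurm at all, the trap logic of test.sh can be exercised locally. The sketch below is only an illustration under assumptions: the file name child.sh, the temporary directory, and the variable names are hypothetical. SIGUSR1 stands in for SIGINT because bash starts background commands of a non-interactive shell with SIGINT ignored, and a signal that is ignored on entry to a shell cannot be trapped. The child also waits on a background sleep, since bash runs a trap only after the current foreground command returns.

```shell
#!/bin/bash
# Local illustration only (no Slurm involved). All names are hypothetical.
workdir=$(mktemp -d)

# A variant of test.sh: same handler, but it traps SIGUSR1 and waits on a
# background sleep so that the trap can run as soon as the signal arrives.
cat > "$workdir/child.sh" <<'EOF'
#!/bin/bash
sig_handler()
{
    echo "Executable interrupted"
    # Stop the background sleep so it does not outlive the script.
    kill "$sleep_pid" 2>/dev/null
    exit 2
}
trap 'sig_handler' SIGUSR1
echo "BEGIN"
sleep 200 &
sleep_pid=$!
wait "$sleep_pid"
echo "END"
EOF

# Start the child in the background, as test.slurm does, then signal it.
bash "$workdir/child.sh" > "$workdir/out.txt" &
child_pid=$!
sleep 1                      # give the child time to install its trap
kill -USR1 "$child_pid"
wait "$child_pid"
child_status=$?

child_output=$(cat "$workdir/out.txt")
echo "$child_output"
echo "exit status: $child_status"
rm -rf "$workdir"
```

Whether the same reasoning carries over to a process launched from a Slurm batch script is not verified here; the sketch only shows that the handler itself behaves as intended once the signal reaches the script.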
Best regards,
Jean-Mathieu