[slurm-users] Execute scripts on suspend and cancel
Brian Andrus
toomuchit at gmail.com
Wed Oct 16 14:20:12 UTC 2019
Typo on my part. Yes, it is SIGSTOP, which cannot be caught.
Maybe you could "pre-signal" any running job with SIGUSR1 before
sending the suspend command, at least if you are manually suspending
the job(s). That could be caught and acted upon before the SIGSTOP is
received.
Brian
On 10/15/2019 10:52 PM, Oytun Peksel wrote:
>
> Brian,
>
> Thanks for your response. I am looking into that option. I am a bit
> confused about which signal is sent though. I thought it was SIGSTOP
> not SIGSTP. And I read you can’t really catch and stop SIGSTOP or
> SIGCONT signals but I am not very good at sys admin stuff anyway.
>
> So in the end, these feel like dirty tricks to me. The select/*
> plugins should have mechanisms to run scripts and such before sending
> signals. But apparently there is no such mechanism.
>
> So probably I will dig deeper into what you suggested.
>
> Thanks
>
> *Oytun Peksel*
> oytun.peksel at semcon.com
> +46739205917
>
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> *On Behalf Of* Brian Andrus
> *Sent:* 15 October 2019 20:58
> *To:* slurm-users at lists.schedmd.com
> *Subject:* Re: [slurm-users] Execute scripts on suspend and cancel
>
> It seems that there are some details that would need to be addressed.
>
> A suspend signal is nothing more than sending a SIGSTP (like hitting
> ctrl-s), so the application is still in memory awaiting SIGCONT
>
> So what should happen when it continues and there are no more
> licenses? So the proper place for what you are looking for is in the
> application itself. If it is given a SIGSTP, it could release the
> licenses and then check them out again when SIGCONT is received.
>
> If you are able to tell your app to release/request a license
> externally, you may want to have a wrapper to do the signal handling
> until they have it as part of their app.
>
> Brian Andrus
>
> On 10/14/2019 4:40 AM, Oytun Peksel wrote:
>
> It is quite weird if slurm has no mechanism as described. I have
> been digging more into it and someone suggested a workaround using
> mail notifications. You use a script instead of the mail
> application to catch the event, then use sacct to see what is
> happening.
>
> Two problems with this:
>
> · No mail is sent when a job is preempted by suspension.
>
> · If you use requeue instead, there will be a mail event and you can
> catch it. Sacct will flag it as “preempted” so you know it is
> requeued. But then it changes to pending, so you really need
> to be quick to catch it. Also, there is no distinctive flag for
> resuming.
>
> Does anyone have another method to execute scripts during preemption?
>
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> *On Behalf Of* Oytun Peksel
> *Sent:* 11 October 2019 09:10
> *To:* slurm-users at lists.schedmd.com
> *Subject:* [slurm-users] Execute scripts on suspend and cancel
>
> Hi,
>
> I was wondering whether there is an option in Slurm to execute custom
> scripts before the suspend signal is sent. What I need to do is tell an
> application to release its licenses before the suspend signal is
> sent during preemption. I think I went through all the
> documentation but could not find a mechanism like this.
>
> BR
>
> /Oytun