[slurm-users] Execute scripts on suspend and cancel

Oytun Peksel Oytun.Peksel at semcon.com
Wed Oct 16 05:52:37 UTC 2019


Brian,

Thanks for your response. I am looking into that option. I am a bit confused about which signal is sent, though: I thought it was SIGSTOP, not SIGTSTP. I have also read that you can't really catch or block SIGSTOP or SIGCONT signals, but I am not very good at sysadmin stuff anyway.
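The catchability question can be checked directly. Below is a minimal Python sketch, assuming a POSIX system; the helper name can_install_handler is made up for illustration. On Linux the kernel refuses a handler for SIGSTOP, while SIGTSTP and SIGCONT accept one.

```python
import signal


def can_install_handler(sig):
    """Return True if the OS lets this process install a handler for sig."""
    def noop(signum, frame):
        pass

    try:
        previous = signal.signal(sig, noop)
    except OSError:
        # The kernel rejects handlers for uncatchable signals such as SIGSTOP.
        return False
    # Put back whatever was there before (None means "not set from Python").
    signal.signal(sig, previous if previous is not None else signal.SIG_DFL)
    return True


for name in ("SIGTSTP", "SIGCONT", "SIGSTOP"):
    print(name, can_install_handler(getattr(signal, name)))
# prints:
# SIGTSTP True
# SIGCONT True
# SIGSTOP False
```

So a process can react to SIGTSTP and SIGCONT, but never to SIGSTOP itself.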

So in the end, these feel like dirty tricks to me. The select/* plugins should have mechanisms to run scripts and such before sending signals. But apparently there is no such mechanism.

So probably I will dig deeper into what you suggested.

Thanks



Oytun Peksel

oytun.peksel at semcon.com

+46739205917


From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Brian Andrus
Sent: 15 October 2019 20:58
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] Execute scripts on suspend and cancel


It seems that there are some details that would need to be addressed.

A suspend signal is nothing more than sending a SIGTSTP (like hitting Ctrl-Z), so the application is still in memory awaiting SIGCONT.

So what should happen when it continues and there are no more licenses? The proper place for what you are looking for is in the application itself. If it is given a SIGTSTP, it could release the licenses and then check them out again when SIGCONT is received.

If you are able to tell your app to release/request a license externally, you may want to have a wrapper to do the signal handling until they have it as part of their app.
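A minimal Python sketch of such a wrapper follows. It assumes the suspend reaches the wrapper as a catchable SIGTSTP before any SIGSTOP (worth verifying on your Slurm version), and that the application has external commands for returning and checking out licenses. RELEASE_CMD, ACQUIRE_CMD, RETRY_SECONDS and make_handlers are hypothetical names, not part of Slurm or of any license manager.

```python
import signal
import subprocess
import sys
import time

# Hypothetical placeholders: substitute the commands your licensed
# application really provides for returning and checking out licenses.
RELEASE_CMD = ["true"]
ACQUIRE_CMD = ["true"]
RETRY_SECONDS = 30  # assumed wait between checkout attempts


def make_handlers(child, release_cmd=RELEASE_CMD, acquire_cmd=ACQUIRE_CMD):
    """Build (on_suspend, on_resume) signal handlers for a child process."""

    def on_suspend(signum, frame):
        # Return the licenses first, then stop the application ourselves.
        subprocess.run(release_cmd, check=False)
        child.send_signal(signal.SIGSTOP)

    def on_resume(signum, frame):
        # The child may have been continued directly; hold it until the
        # licenses are checked out again.
        child.send_signal(signal.SIGSTOP)
        while subprocess.run(acquire_cmd, check=False).returncode != 0:
            time.sleep(RETRY_SECONDS)
        child.send_signal(signal.SIGCONT)

    return on_suspend, on_resume


def main(argv):
    """Run argv as the real application and relay suspend/resume to it."""
    if not argv:
        print("usage: wrapper COMMAND [ARGS...]", file=sys.stderr)
        return 2
    child = subprocess.Popen(argv)
    on_suspend, on_resume = make_handlers(child)
    signal.signal(signal.SIGTSTP, on_suspend)
    signal.signal(signal.SIGCONT, on_resume)
    return child.wait()
```

To use it as a script, call sys.exit(main(sys.argv[1:])) at the bottom and launch the application through it. The release command has to finish quickly, because an uncatchable SIGSTOP may follow and freeze the wrapper as well. The retry loop in on_resume is one possible answer to the "no more licenses" case: the application stays stopped until a checkout succeeds.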

Brian Andrus


On 10/14/2019 4:40 AM, Oytun Peksel wrote:
It is quite weird if Slurm has no mechanism as described. I have been digging more into it and someone suggested a workaround using mail notifications. You use a script instead of the mail application to catch the event, then use sacct to see what is happening.

Two problems with this:

* There is no mail sent when a job is preempted by suspension.

* If you use requeue instead, there will be a mail event and you can catch it. sacct will flag the job as "preempted", so you know it is being requeued. But then the state changes to pending, so you really need to be quick to catch it. Also, there is no distinctive flag for resuming.
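A rough Python sketch of the mail-replacement script is below. It assumes the script is set as MailProg in slurm.conf, that it is invoked the way a mail client would be (-s SUBJECT recipient), and that the subject contains "Job_id=<number>". The subject layout is an assumption to check against your Slurm version, and the function names are made up for illustration.

```python
import re
import subprocess
import sys

JOB_ID_RE = re.compile(r"Job_id=(\d+)")  # assumed subject layout


def subject_from_argv(argv):
    """Return the value following -s, as in `mail -s SUBJECT recipient`."""
    if "-s" in argv:
        i = argv.index("-s")
        if i + 1 < len(argv):
            return argv[i + 1]
    return None


def job_id_from_subject(subject):
    """Extract the numeric job id from the subject, or None if absent."""
    match = JOB_ID_RE.search(subject)
    return match.group(1) if match else None


def job_state(job_id):
    """Ask sacct for the current state of the job allocation."""
    # -n: no header, -X: allocation only, -P: parsable output
    result = subprocess.run(
        ["sacct", "-n", "-X", "-P", "-j", job_id, "--format=State"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0].strip() if lines else None


def main(argv):
    job_id = job_id_from_subject(subject_from_argv(argv) or "")
    if job_id is None:
        return 1
    state = job_state(job_id)
    if state == "PREEMPTED":
        # Hypothetical hook: release licenses or run any custom script here.
        print(f"job {job_id} was preempted", file=sys.stderr)
    return 0
```

To use it as a script, call sys.exit(main(sys.argv[1:])) at the bottom. It does not remove the race described above: if the state has already moved on to pending by the time sacct runs, the preemption is missed.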


Does anyone have another method to execute scripts during preemption?




Oytun Peksel

oytun.peksel at semcon.com

+46739205917


From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Oytun Peksel
Sent: 11 October 2019 09:10
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] Execute scripts on suspend and cancel

Hi,

I was wondering whether there is an option in Slurm to execute custom scripts before the suspend signal is sent. What I need to do is tell an application to release its licenses before the suspend signal is sent during preemption. I think I went through all the documentation but could not find a mechanism like this.

BR
/Oytun

