[slurm-users] Slurm strigger configuration

Jodie H. Sprouse jhs43 at cornell.edu
Thu Sep 20 13:59:56 MDT 2018


Thank you to both Kilian and Chris, 
I now have the following running on the Slurm server to report once whenever any of the nodes goes into the "Drain" state:

sudo -u slurm bash -c "strigger --set -D -p /etc/slurm/triggers/slurm_admin_notify --flags=perm"
/bin/mail -s "ClusterName DrainedNode:$*" our_admin_email_address
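
The mail line above is presumably the body of the slurm_admin_notify script, with the drained node names arriving as the script's arguments ($*). As a minimal sketch under that assumption, such a script could look like the following. The CLUSTER_NAME, ADMIN_EMAIL and MAILER variables and both function names are hypothetical, introduced only for illustration, and the explicit message body is an addition that is not in the command above.

```shell
#!/bin/bash
# Hypothetical sketch of /etc/slurm/triggers/slurm_admin_notify.
# Assumption: strigger runs this program with the affected node names
# as its arguments, which is what the "$*" in the mail line relies on.

# Placeholder values; replace with site-specific settings.
CLUSTER_NAME="${CLUSTER_NAME:-ClusterName}"
ADMIN_EMAIL="${ADMIN_EMAIL:-our_admin_email_address}"
MAILER="${MAILER:-/bin/mail}"

build_subject() {
    # Join all arguments (node names) into one subject line.
    printf '%s DrainedNode:%s' "$CLUSTER_NAME" "$*"
}

notify_drained() {
    # Supply a body on stdin so the mailer does not wait for input.
    printf 'Drained node(s): %s\n' "$*" |
        "$MAILER" -s "$(build_subject "$@")" "$ADMIN_EMAIL"
}

# Send mail only when at least one node name was passed in.
if [ "$#" -gt 0 ]; then
    notify_drained "$@"
fi
```

The script has to be executable by the slurm user, for example via chmod 755 on the file.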

Exactly what we needed. Success. 
Thank you!
Jodie
Center for Advanced Computing
Cornell University


On Sep 19, 2018, at 5:14 PM, Christopher Benjamin Coffey <Chris.Coffey at nau.edu> wrote:

Kilian, thank you very much! I had never noticed the perm flag!

Best,
Chris

—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167


On 9/19/18, 10:01 AM, "slurm-users on behalf of Kilian Cavalotti" <slurm-users-bounces at lists.schedmd.com on behalf of kilian.cavalotti.work at gmail.com> wrote:

   On Wed, Sep 19, 2018 at 9:21 AM Christopher Benjamin Coffey
   <Chris.Coffey at nau.edu> wrote:
> The only thing that I've gotten working so far is this:
> sudo -u slurm bash -c "strigger --set -D -n cn15 -p /common/adm/slurm/triggers/nodestatus"
> 
> So, that will run the nodestatus script, which emails when the node cn15 gets set into drain state. What I'd like to do, which I haven't put time into figuring out, is set up a persistent trigger that can run when ANY node goes into drain state. Let me know if you figure that out. As you can see above, the trigger has to be set up by the slurm user.

   strigger takes a "--flags=perm" option, which makes the trigger
   permanent, so it is not purged after the event has happened and
   doesn't need to be re-armed in the trigger script for the next event.

   Also, not specifying a job ID or a node name in the strigger command
   will make the trigger apply to all nodes. That's how we set our
   default "down" trigger for all nodes on our cluster:
   # su -s /bin/bash -c "strigger --set --down --program=/share/admin/scripts/slurm/triggers/down.sh --flags=perm" slurm

   Also note that the script given to "--program" needs to be executable
   by the slurm user on the controller node(s).
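
The points above (permanent flag, no node name, executable program) can be combined into a small pre-flight check. This is only a sketch: the two function names are hypothetical, and the check reflects one reasonable workflow rather than anything strigger itself requires you to run. It verifies the program path for the current user and prints the strigger command instead of executing it.

```shell
#!/bin/bash
# Sketch: sanity-check a trigger program, then build the strigger
# command that would register it as a permanent trigger for all nodes.
# Function names are illustrative, not part of Slurm.

check_trigger_program() {
    # $1 = program path. Succeeds only if the path is absolute and
    # points to a regular file the current user can execute.
    case "$1" in
        /*) ;;
        *) echo "not an absolute path: $1" >&2; return 1 ;;
    esac
    [ -f "$1" ] && [ -x "$1" ] || {
        echo "not an executable file: $1" >&2
        return 1
    }
}

build_trigger_command() {
    # $1 = event option (e.g. --down or --drained), $2 = program path.
    # No node name is given, so the trigger applies to all nodes.
    printf 'strigger --set %s --program=%s --flags=perm' "$1" "$2"
}
```

Run as the slurm user on the controller, a call such as check_trigger_program /share/admin/scripts/slurm/triggers/down.sh followed by executing the output of build_trigger_command --down with the same path would register the trigger. Afterwards, "strigger --get" should list it.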

   Cheers,
   -- 
   Kilian





