[slurm-users] Slurm strigger configuration

Christopher Benjamin Coffey Chris.Coffey at nau.edu
Wed Sep 19 10:19:24 MDT 2018


Hi Jodie,

The only thing that I've gotten working so far is this:

sudo -u slurm bash -c "strigger --set -D -n cn15 -p /common/adm/slurm/triggers/nodestatus"

So, that runs the nodestatus script, which sends an email when node cn15 is set to the drain state. What I'd like to do, but haven't put time into figuring out, is set up a persistent trigger that runs when ANY node goes into the drain state. Let me know if you figure that out. As you can see above, the trigger has to be set up by the slurm user.
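
A possible starting point for the any-node case, offered as an untested sketch rather than a confirmed recipe: the strigger man page describes --node with no host name as covering all nodes on the system, and --flags=PERM as keeping a trigger registered after it fires. Whether both behave that way on a given Slurm version is an assumption to verify. The helper name drain_trigger_cmd is hypothetical, and it only prints the command line so it can be reviewed before anything is registered.

```shell
#!/bin/bash
# Sketch: build the strigger command for a persistent, any-node drain trigger.
# Assumptions (check against your strigger man page):
#   --node with no host name  -> all nodes are considered
#   --flags=PERM              -> trigger is not purged after it fires
# drain_trigger_cmd is a hypothetical helper; it prints, it does not register.
drain_trigger_cmd() {
    # $1: path to the program the trigger should run
    printf 'strigger --set --node --drained --flags=PERM --program=%s\n' "$1"
}

# Show the command for the nodestatus script used above.
drain_trigger_cmd /common/adm/slurm/triggers/nodestatus
```

To register it, the printed command would still have to run as the slurm user, for example: sudo -u slurm bash -c "$(drain_trigger_cmd /common/adm/slurm/triggers/nodestatus)". Afterwards, strigger --get lists the registered triggers.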

Best,
Chris

—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 

On 9/19/18, 8:48 AM, "slurm-users on behalf of Jodie H. Sprouse" <slurm-users-bounces at lists.schedmd.com on behalf of jhs43 at cornell.edu> wrote:

    
    
    
    Good morning. 
    I’m struggling with getting strigger working correctly. 
    My end goal sounds fairly simple: to get a mail notification if a node gets set into ‘drain’ mode. 
    
    
    The man page for strigger states that it must be run by the configured SlurmUser, which here is slurm:
    # scontrol show config | grep SlurmUser
    SlurmUser               = slurm(990)
    
    
    
    # grep slurm /etc/passwd
    slurm:x:990:984:SLURM resource manager:/etc/slurm:/sbin/nologin
    
    
    
    I created the file per the man page (I’m first trying to get it to work when a node goes down, after receiving “option --drain does not exist”):
    # cat /usr/sbin/slurm_admin_notify
    #!/bin/bash
    # Submit trigger for next event
    strigger --set --node --down \
             --program=/usr/sbin/slurm_admin_notify
    # Notify administrator by e-mail
    /bin/mail -s "NodesDown:$*" OurSiteAdmin@OurEmailServer.edu
    
    ———
    If I run manually, I receive:
    slurm_set_trigger: Access/permission denied
    
    
    If I add “runuser -l slurm -c” in front of the strigger command, I receive:
    This account is currently not available.
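
"This account is currently not available." is the message printed by /sbin/nologin: runuser -l starts the account's login shell, and the passwd entry above sets that shell to /sbin/nologin. A minimal sketch of one way around it, assuming a util-linux runuser that supports -u (which runs a command directly, without a login shell). The helper name as_slurm_cmd is hypothetical, and it only prints the resulting command line.

```shell
#!/bin/bash
# Sketch: prefix a command so it runs as the slurm user without a login shell.
# Assumption: runuser supports "-u USER -- COMMAND" (util-linux).
# as_slurm_cmd is a hypothetical helper; it prints the command, it does not run it.
as_slurm_cmd() {
    printf 'runuser -u slurm --'
    printf ' %s' "$@"
    printf '\n'
}

# Show how the strigger call from the script would be wrapped.
as_slurm_cmd strigger --set --node --down --program=/usr/sbin/slurm_admin_notify
```

The printed line has to be run as root. sudo -u slurm, as in the reply at the top of this message, avoids the login shell in the same way.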
    
    
    The man page also states: “Trigger events are not processed instantly, but a check is performed for trigger events on a periodic basis (currently every 15 seconds).”
    This leads me to believe I may be missing something in my install: where is that 15-second interval set?
    
    
    Any suggestions would be greatly appreciated! How are folks accomplishing this?
    Thank you!
    Jodie
    
    
    


