[slurm-users] How do I change the status of a node to FAILING?

fj2770fj at fujitsu.com fj2770fj at fujitsu.com
Tue Dec 24 06:11:03 UTC 2019


Hi all,

I want to test strigger's fail event, but I don't know how to put a node into the FAILING state.

man strigger:
-F, --fail
Trigger an event if the specified node goes into a FAILING state. 

How do I set the status of a node to FAILING?

[root@ohpc137pbsop-sms ~]# cat /usr/sbin/slurm_admin_notify
#!/bin/bash
# Re-register the trigger for the next event (a trigger is removed once it fires)
strigger --set --node --fail --program=/usr/sbin/slurm_admin_notify
# Log that the fail event trigger ran
echo "[event trigger] slurm admin notify fail event trigger run" >> /tmp/slurmctld.log

[root@ohpc137pbsop-sms ~]#
[root@ohpc137pbsop-sms ~]# chmod +x /usr/sbin/slurm_admin_notify

[root@ohpc137pbsop-sms ~]# strigger --get
     TRIG_ID RES_TYPE   RES_ID TYPE                                OFFSET USER     FLAGS PROGRAM
          18 node            * fail                                     0 root           /usr/sbin/slurm_admin_notify
[root@ohpc137pbsop-sms ~]#

The remaining step is to change a compute node's state to FAILING, and that is the part I don't know how to do.
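
One thing that might work for this last step, as an untested sketch: the scontrol man page lists FAIL among the node states that "scontrol update" can set, and sinfo describes FAILING as a node in that condition that is still running a job. The node name node001, the reason text, and the RUN variable below are placeholders I made up for illustration. By default the function only prints the command, and RUN=1 would actually run it (as root or SlurmUser).

```shell
#!/bin/bash
# Sketch: request the FAIL/FAILING state for a node so a --fail trigger can be tested.
# Assumptions: "node001" and the reason text are placeholders, and RUN is a
# hypothetical switch defined only by this script.

fail_node() {
    local node="$1" reason="$2"
    # State=FAIL is the value scontrol accepts; Reason is passed along with it.
    local cmd=(scontrol update "NodeName=${node}" State=FAIL "Reason=${reason}")
    if [ "${RUN:-0}" = "1" ]; then
        "${cmd[@]}"          # really change the node state
    else
        echo "${cmd[*]}"     # dry run: only show the command
    fi
}

fail_node "node001" "strigger test"
```

If this does change the state, "sinfo -N -l" should show whether the node is reported as FAIL or FAILING, and "scontrol update NodeName=node001 State=RESUME" should return it to service afterwards.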

Regards,
Tomo



