<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>David,</p>
    <p>For monitoring, I use a combination of Netdata and Prometheus.
      Data is gathered whenever the nodes are up and stored for history.
      Yes, there are empty gaps when the nodes are powered down, but I
      simply interpret a gap as the node being powered down.</p>
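    <p>A minimal sketch of that gap interpretation, assuming the sample
      timestamps for a node have already been exported from the
      monitoring store. The function name, the tolerance factor and the
      one-minute scrape interval are hypothetical and are not taken from
      any Netdata or Prometheus API.</p>
    <pre>
```python
from datetime import datetime, timedelta


def find_gaps(samples, scrape_interval, tolerance=2.0):
    """Return (start, end) pairs where consecutive samples are further
    apart than tolerance * scrape_interval."""
    threshold = scrape_interval * tolerance
    ordered = sorted(samples)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later))
    return gaps


if __name__ == "__main__":
    # Hypothetical data: samples every minute from 08:00 to 08:04,
    # then nothing until the node is back at 09:00.
    start = datetime(2022, 2, 23, 8, 0)
    minute = timedelta(minutes=1)
    samples = [start + i * minute for i in range(5)]
    samples += [start + timedelta(hours=1) + i * minute for i in range(3)]
    for gap_start, gap_end in find_gaps(samples, scrape_interval=minute):
        print(f"no data from {gap_start} to {gap_end}: treated as powered down")
```
    </pre>
    <p>A gap found this way only says that no data arrived. Telling a
      powered-down node from a crashed one would need a second source,
      such as the node state reported by Slurm.</p>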
    <p>For the config, I have no access to DNS for configless mode, so
      I use a symlink to the slurm.conf file on a shared filesystem.
      This works great. Whenever there are changes, a simple 'scontrol
      reconfigure' brings all running nodes up to date, and any down
      nodes will automatically read the latest version when they come
      back up.</p>
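    <p>A rough sketch of the symlink approach, assuming the shared copy
      and the local path are passed in by the caller. The helper names
      are hypothetical, and the paths in the note that follows are
      examples only. 'scontrol reconfigure' is the real Slurm command
      mentioned above.</p>
    <pre>
```python
import subprocess
from pathlib import Path


def link_shared_config(shared_conf, local_conf):
    """Point local_conf at shared_conf, replacing any existing file or link."""
    shared_conf = Path(shared_conf)
    local_conf = Path(local_conf)
    # is_symlink() also catches a dangling link, which exists() would miss.
    if local_conf.is_symlink() or local_conf.exists():
        local_conf.unlink()
    local_conf.symlink_to(shared_conf)
    return local_conf


def reconfigure():
    """Ask the running Slurm daemons to re-read slurm.conf."""
    subprocess.run(["scontrol", "reconfigure"], check=True)
```
    </pre>
    <p>For example, each node could run
      link_shared_config("/shared/slurm/slurm.conf",
      "/etc/slurm/slurm.conf") once at provisioning time (both paths are
      assumptions), and reconfigure() would then be called from an admin
      host after each edit of the shared file.</p>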
    <p>Brian Andrus<br>
    </p>
    <div class="moz-cite-prefix">On 2/23/2022 2:31 AM, David Simpson
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CWLP265MB073837D08BD79CBF99E56E9BBE3C9@CWLP265MB0738.GBRP265.PROD.OUTLOOK.COM">
      <div class="WordSection1">
        <p class="MsoNormal">Hi all,<br>
          <br>
          I'm interested to know what the common approaches are to:<br>
          <br>
          <o:p></o:p></p>
        <ul style="margin-top:0cm" type="disc">
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1">Monitoring
            power-saving nodes (e.g. the health of the node), given that
            the monitoring system will potentially see them go up and
            down. Do you limit yourselves to BMC-only monitoring/health?<o:p></o:p></li>
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1">Making
            changes to slurm.conf (or anything else) on a node which is
            down due to power saving (during a maintenance/reservation).
            What is your approach? Do you end up with two slurm.conf
            files (one for power saving and one that keeps everything
            up, to work on during the maintenance)?<o:p></o:p></li>
        </ul>
        <p class="MsoNormal"><br>
          thanks<br>
          David<br>
          <br>
          <o:p></o:p></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">-------------<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">David
            Simpson - Senior Systems Engineer<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">ARCCA,
            Redwood Building,<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">King
            Edward VII Avenue,<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">Cardiff,
            CF10 3NB<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">David
            Simpson - peiriannydd uwch systemau<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">ARCCA,
            Adeilad Redwood,<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">King
            Edward VII Avenue,<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">Caerdydd,
            CF10 3NB<o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB"><a
              href="mailto:simpsond4@cardiff.ac.uk"
              moz-do-not-send="true"><span style="color:#0563C1">simpsond4@cardiff.ac.uk</span></a><o:p></o:p></span></p>
        <p class="MsoNormal"><span
            style="font-size:9.0pt;mso-fareast-language:EN-GB">+44 29208
            74657<o:p></o:p></span></p>
        <p class="MsoNormal"><o:p> </o:p></p>
      </div>
    </blockquote>
  </body>
</html>