<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>You shouldn't have to change any parameters if you have it
      configured in the defaults. Just systemctl stop/start slurmd as
      needed.</p>
    <p><br>
    </p>
    <p>something like:</p>
    <p>scontrol update state=drain nodename=&lt;node_to_change&gt;
      reason="MIG reconfig"</p>
    <p>&lt;wait for it to be drained&gt;</p>
    <p>ssh &lt;node_to_change&gt; "systemctl stop slurmd"</p>
    <p>&lt;run reconfig stuff&gt;</p>
    <p>ssh &lt;node_to_change&gt; "systemctl start slurmd"</p>
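    <p>If you want that wrapped in a script on the head node, here is a
      rough Python sketch of the same sequence. It is untested against a
      real cluster. The function names, the reconfig command and the
      30-second poll interval are made up for illustration, and the
      drain check assumes "sinfo -h -n &lt;node&gt; -o %T" reports
      "drained" once the node is idle. It defaults to a dry run that
      only prints the commands.</p>
    <pre>
```python
import shlex
import subprocess
import time


def mig_reconfig_commands(node, reconfig_cmd, reason="MIG reconfig"):
    """Return the ordered commands for one drain/stop/reconfig/start cycle.

    reconfig_cmd is a placeholder for whatever your site runs on the node.
    """
    def on_node(remote_cmd):
        # Run a command on the compute node over ssh.
        return ["ssh", node, remote_cmd]

    return [
        ["scontrol", "update", "state=drain",
         "nodename=" + node, "reason=" + reason],
        on_node("systemctl stop slurmd"),
        on_node(reconfig_cmd),
        on_node("systemctl start slurmd"),
    ]


def is_drained(sinfo_output):
    """True if every state word printed by sinfo starts with 'drained'."""
    states = sinfo_output.split()
    return bool(states) and all(s.startswith("drained") for s in states)


def wait_until_drained(node, poll_seconds=30):
    """Poll sinfo until the node reports a drained state (assumed format)."""
    while True:
        result = subprocess.run(
            ["sinfo", "-h", "-n", node, "-o", "%T"],
            check=True, capture_output=True, text=True,
        )
        if is_drained(result.stdout):
            return
        time.sleep(poll_seconds)


def run_cycle(node, reconfig_cmd, dry_run=True):
    """Print the commands; when dry_run is False, also execute them."""
    drain, *node_steps = mig_reconfig_commands(node, reconfig_cmd)
    print(shlex.join(drain))
    if not dry_run:
        subprocess.run(drain, check=True)
        wait_until_drained(node)
    for cmd in node_steps:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the cycle at the first failing step.
            subprocess.run(cmd, check=True)
```
    </pre>
    <p>Calling run_cycle("node01", "your-reconfig-command") prints the
      four commands without touching anything; pass dry_run=False once
      the output looks right. Bringing the node back out of drain
      afterwards is not covered here.</p>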
    <p><br>
    </p>
    <p></p>
    <p>Not sure what would make you feel slurmd cannot run as a service
      on a dynamic node. As long as you added the options to the systemd
      defaults file for it, you should be fine (usually
      /etc/default/slurmd, or /etc/sysconfig/slurmd on RHEL-type
      systems)<br>
    </p>
    <p><br>
    </p>
    <p>Brian<br>
    </p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 9/23/2022 7:40 AM, Groner, Rob
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:BL0PR02MB449998D4369EBD2D924608AD80519@BL0PR02MB4499.namprd02.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
        255, 255);" class="elementToProof">
        Ya, we're still working out the mechanism for taking the node
        out, making the changes, and bringing it back. But the part I
        can't figure out is slurmd running on the remote node.  What do
        I do with it?  Do I run it standalone, and when I need to
        reconfigure, I kill -9 it and execute it again with the new
        configuration?  Or what if slurmd is running as a service (as it
        does on all our non-dynamic nodes)?  Do I stop it, change its
        service parameters and then restart it to reconfigure the node? 
        The docs on slurm for dynamic nodes don't give any indication of
        how you handle slurmd running on the dynamic node.  What is the
        preferred method?  </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
        255, 255);" class="elementToProof">
        <br>
      </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
        255, 255);" class="elementToProof">
        Rob</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
        255, 255);" class="elementToProof">
        <br>
      </div>
      <hr style="display:inline-block;width:98%" tabindex="-1">
      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
          face="Calibri, sans-serif" color="#000000"><b>From:</b>
          slurm-users <a class="moz-txt-link-rfc2396E" href="mailto:slurm-users-bounces@lists.schedmd.com">&lt;slurm-users-bounces@lists.schedmd.com&gt;</a> on
          behalf of Brian Andrus <a class="moz-txt-link-rfc2396E" href="mailto:toomuchit@gmail.com">&lt;toomuchit@gmail.com&gt;</a><br>
          <b>Sent:</b> Friday, September 23, 2022 10:24 AM<br>
          <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:slurm-users@lists.schedmd.com">slurm-users@lists.schedmd.com</a>
          <a class="moz-txt-link-rfc2396E" href="mailto:slurm-users@lists.schedmd.com">&lt;slurm-users@lists.schedmd.com&gt;</a><br>
          <b>Subject:</b> Re: [slurm-users] slurmd and dynamic nodes</font>
        <div> </div>
      </div>
      <div>
        <div>
          <p><br>
          </p>
          <p>Just off the top of my head here.</p>
          <p>I would expect you need to have no jobs currently running
            on the node, so you could submit a job to the node
            that sets the node to drain, does any local things needed,
            then exits. As part of the EpilogSlurmctld script, you could
            check for drained nodes based on some reason (like 'MIG
            reconfig') and do the head node steps there, with a final
            bit of bringing it back online.
            <br>
          </p>
          <p><br>
          </p>
          <p>Or just do all those steps from a script outside slurm
            itself, on the head node. You can use ssh/pdsh to connect to
            a node and execute things there while it is out of the mix.<br>
          </p>
          <p><br>
          </p>
          <p>Brian Andrus<br>
          </p>
          <p><br>
          </p>
          <div class="x_moz-cite-prefix">On 9/23/2022 7:09 AM, Groner,
            Rob wrote:<br>
          </div>
          <blockquote type="cite">
            <style type="text/css" style="display:none">p
        {margin-top:0;
        margin-bottom:0}</style>
            <div class="x_elementToProof"
              style="font-family:Calibri,Arial,Helvetica,sans-serif;
              font-size:12pt; color:rgb(0,0,0);
              background-color:rgb(255,255,255)">
               <br>
            </div>
            <div dir="ltr">
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                I'm working through how to use the new dynamic node
                features in order to take down a particular node,
                reconfigure it <span class="x_x_ContentPasted0"
                  style="color:rgb(0,0,0);
                  background-color:rgb(255,255,255);
                  display:inline!important">(using nvidia MIG to change
                  the number of graphic cores available)</span> and give
                it back to slurm.</div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                I'm at the point where I can take a node out of slurm's
                control from the master node (scontrol delete
                nodename....), make the nvidia-smi change, and then
                execute slurmd on the node with the changed
                configuration parameters.  It then does show up again in
                the sinfo output on the master node, with the correct
                new resources.</div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                What I'm not sure about is...when I want to reconfigure
                the <span class="x_x_ContentPasted1"
                  style="color:rgb(0,0,0);
                  background-color:rgb(255,255,255);
                  display:inline!important">
                  dynamic </span>node AGAIN, how do I do that on the
                target node?  I can use "scontrol delete" again on the
                scheduler node, but on the
                <span style="color:rgb(0,0,0);
                  background-color:rgb(255,255,255);
                  display:inline!important">
                  dynamic</span> node, slurmd will still be running. 
                Currently, for testing purposes, I just find the process
                ID and kill -9 it.  Then I change the node configuration
                and execute "slurmd -Z --conf=...." again.  </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                Is there a more elegant way to change the configuration
                on the <span style="color:rgb(0,0,0);
                  background-color:rgb(255,255,255);
                  display:inline!important">
                  dynamic</span> node than by killing the existing
                slurmd process and starting it again? </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                I'll note that I tried doing everything from the master
                (slurmctld) node, since there is an option of creating
                the node there with "scontrol create" instead of using
                slurmd on the dynamic node.  But when I tried that, the
                dynamic node I created showed up in sinfo output with a
                ~ next to it (powered off).  The dynamic node docs page
                online did not mention what, if anything, slurmd was
                supposed to be running as on the dynamic node if
                attempting to handle delete and create only on the
                master node. </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                Thanks.</div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                Rob</div>
              <div class="x_x_elementToProof"
                style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0);
                background-color:rgb(255,255,255)">
                <br>
              </div>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
  </body>
</html>