<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hi Jordan.</div><div>Regarding filling up the nodes look at</div><div><a href="https://slurm.schedmd.com/elastic_computing.html">https://slurm.schedmd.com/elastic_computing.html</a></div><div><br></div><div><dt style="text-align:left;color:rgb(70,84,92);text-transform:none;text-indent:0px;letter-spacing:normal;font-size:20px;font-style:normal;font-variant:normal;font-weight:400;text-decoration:none;word-spacing:0px;white-space:normal;box-sizing:border-box"><b style="margin:0px;padding:0px;border:0px;line-height:inherit;font-family:inherit;font-size:inherit;font-style:inherit;font-variant:inherit;font-weight:bold;vertical-align:baseline;box-sizing:border-box;font-size-adjust:none;font-stretch:inherit">SelectType</b>
</dt><dd style="text-align:left;color:rgb(70,84,92);text-transform:none;text-indent:0px;letter-spacing:normal;font-size:20px;font-style:normal;font-variant:normal;font-weight:400;text-decoration:none;word-spacing:0px;white-space:normal;box-sizing:border-box">Generally must be "select/linear".
If Slurm is configured to allocate individual CPUs to jobs rather than whole
nodes (e.g. SelectType=select/cons_res rather than SelectType=select/linear),
then Slurm maintains bitmaps to track the state of every CPU in the system.
If the number of CPUs to be allocated on each node is not known when the
slurmctld daemon is started, one must allocate whole nodes to jobs rather
than individual processors.
The use of "select/cons_res" requires each node to have a CPU count set and
the node eventually selected must have at least that number of CPUs.
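
As a minimal sketch of what that looks like in practice (the node names
and range below are placeholders, not taken from your setup):

    # slurm.conf: allocate whole nodes rather than individual CPUs
    SelectType=select/linear

    # Cloud nodes declared with a static CPU count and State=CLOUD, so
    # slurmctld knows each node's size before any instance is booted.
    NodeName=aws-node[001-100] CPUs=8 State=CLOUD

With a static CPU count the controller knows the size of every cloud node
up front, which is what the elastic computing support requires.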

If I am not wrong, you can configure the number of CPUs per node as a fixed
amount - if you select a fixed instance type. From the AWS blog post:

NOTE: This demo uses c4.2xlarge instance types for the compute nodes, which
have statically set the number of CPUs=8 in slurm_nodes.conf. If you want to
experiment with different instance types (in slurm-aws-startup.sh), ensure
you change the CPUs in slurm_nodes.conf.
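
For example, moving from c4.2xlarge (8 vCPUs) to c4.4xlarge (16 vCPUs)
would mean updating the node definition to match (hypothetical node names;
the real pattern comes from the demo's slurm_nodes.conf):

    # slurm_nodes.conf: the CPU count must match the chosen instance type
    NodeName=aws-node[001-100] CPUs=16 State=CLOUD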
Symbol";font-size:16px;font-style:normal;font-variant:normal;font-weight:400;text-decoration:none;word-spacing:0px;display:inline;white-space:normal;float:none;background-color:rgb(255,255,255)">.</span><b></b><i></i><u></u><sub></sub><sup></sup><strike></strike><br></div><div><b></b><i></i><u></u><sub></sub><sup></sup><strike></strike><br></div><div><b></b><i></i><u></u><sub></sub><sup></sup><strike></strike><br></div><div><br></div><div><br></div><div><br></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, 26 Oct 2018 at 07:13, J.R. W <<a href="mailto:jwillis0720@gmail.com">jwillis0720@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><br><blockquote type="cite"></blockquote><font color="#5856d6"><br></font>Hello everyone,<br><font color="#5856d6"><br></font>I setup a SLURM cluster based on this post and plugin. <a href="https://aws.amazon.com/blogs/compute/deploying-a-burstable-and-event-driven-hpc-cluster-on-aws-using-slurm-part-1/" target="_blank">https://aws.amazon.com/blogs/compute/deploying-a-burstable-and-event-driven-hpc-cluster-on-aws-using-slurm-part-1/</a><br><font color="#5856d6"><br></font>When I submit jobs to the queue, the AWS instances start configuring. Because I have so many potential instances, for each job, they spool up one instance. For example, if I submit 10 job, AWS will configure 10 instances. What would be ideal is if there is a slurm.conf option I’m missing that will tell the power-save plugin to only configure N amount of nodes, even though there hundreds of “available” nodes to configure in the cloud. Some potential solutions I have thought of.<br><font color="#5856d6"><br></font>1. Have the scheduler fill up nodes even if they are in the configuring state. SLURM knows how many CPUs are available for the nodes that are being configured. Is there a way to have jobs all fill up a node, even if it’s in the configuring state? That way, a queued job will not trigger the “power save resume” of a new node. <br><font color="#5856d6"><br></font>2. Some parameter in slurm.conf that has maximum nodes that can be available.<br><font color="#5856d6"><br></font>3. Modify my slurm_resum script to check for how many nodes are configured. If that number is greater than my N amount of nodes I want spun up, then do nothing. Hopefully that will just send the job back to the queue to await one of those configured nodes.<br><font color="#5856d6"><br></font>I hope I’m making sense. I know the elastic computing is a new feature<br><font color="#5856d6"><br></font>Jordan<br><font color="#5856d6"><br></font><div style="word-wrap:break-word"><div><br></div></div></div><br></div></blockquote></div>