<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>You could set up a dummy node that advertises the features that are
not currently active, and keep jobs from being scheduled on that node by
setting it to DOWN. That would be a hacky way of accomplishing this.</p>
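<p>As a rough illustration of that idea, a placeholder entry in
slurm.conf might look like the fragment below. The node name, feature
names, and partition layout are hypothetical, and whether the controller
also needs a resolvable address for the placeholder is not covered here,
so this is a sketch under those assumptions rather than a tested
configuration.</p>
<pre>
```
# Hypothetical placeholder node: never runs jobs, only advertises features.
NodeName=dummy-features Feature=version-a,version-b State=DOWN Reason="feature placeholder"
# The placeholder has to belong to the partition that jobs are submitted to.
PartitionName=test Nodes=node[01-04],dummy-features Default=YES State=UP
```
</pre>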
<p>-Paul Edmon-<br>
</p>
<div class="moz-cite-prefix">On 7/9/2020 7:15 PM, Raj Sahae wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1D486FC2-26C5-4E55-90E3-85340C01C7F4@teslamotors.com">
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">Hi
all,</span><span style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">My
apologies if this is sent twice. The first time I sent it
without my subscription to the list being complete.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">I
am attempting to use Slurm as a test automation system for
its fairly advanced queueing and job control abilities, and
also because it scales very well.</span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">However,
since our use case is a bit outside the standard usage of
Slurm, we are hitting some issues that don’t appear to have
obvious solutions.</span><span style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"> </span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">In
our current setup, the Slurm nodes are hosts attached to a
test system. Our pipeline (greatly simplified) would be to
install some software on the test system and then run sets
of tests against it.</span><span style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">In
our old pipeline, this was done in a single job. With Slurm,
however, I was hoping to decouple these two actions, as that
makes the entire pipeline more robust to update failures and
would give us finer-grained job control for the actual
test run.</span><span
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"> </span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">I
would like to allow users to queue jobs with constraints
indicating which software version they need. Then separately
some automated job would scan the queue, see jobs that are
not being allocated due to missing resources, and queue
software installs appropriately. We attempted to do this
using the Active/Available Features configuration. We use
HealthCheck and Epilog scripts to scrape the test system for
software properties (version, commit, etc.) and assign them
as Features. Once an install is complete and the Features
are updated, queued jobs would start to be allocated on
those nodes.</span><span style="color:black"><o:p></o:p></span></p>
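<p class="MsoNormal">The queue-scanning step described above could be
sketched roughly as follows. This is a minimal sketch, not an existing
tool: the function names are hypothetical, the input is assumed to be
lines of the form JOBID|features (for example from something like
squeue -h -t PENDING -o "%i|%f"), only comma-separated feature lists are
handled, and it presumes the jobs could be queued in the first place.</p>
<pre>
```python
def parse_pending(lines):
    """Map each pending job id to the set of features it requests.

    Each line is assumed to look like "JOBID|feat1,feat2".
    An empty field or "(null)" means the job has no constraint.
    """
    jobs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        job_id, _, feats = line.partition("|")
        jobs[job_id] = {f for f in feats.split(",") if f and f != "(null)"}
    return jobs


def versions_to_install(jobs, active_by_node):
    """Return {feature: [job ids]} for features no node currently has active."""
    # Union of the active features across all nodes.
    active = set().union(*active_by_node.values())
    missing = {}
    for job_id, feats in jobs.items():
        for feat in feats - active:
            missing.setdefault(feat, []).append(job_id)
    return missing


if __name__ == "__main__":
    pending = parse_pending(["101|version-a", "102|version-b"])
    print(versions_to_install(pending, {"node1": {"version-b"}}))
```
</pre>
<p class="MsoNormal">Each key in the result is a feature that some
pending job needs but no node currently advertises as active, which is
the signal an automated job could use to queue an install.</p>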
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"> </span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">Herein
lies the conundrum. If a user submits a job constrained to
run on Version-A, but all nodes in the cluster are currently
configured with Features=Version-B, Slurm will refuse to queue
the job, reporting an invalid feature specification. I
completely understand why Features are implemented this way,
so my question is: is there some workaround or other Slurm
capability that I could use to achieve this behavior?
Otherwise my options seem to be:</span><span
style="color:black"><o:p></o:p></span></p>
<ol style="margin-top:0in" type="1" start="1">
<li class="MsoNormal" style="color:black;mso-list:l0 level1
lfo1"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif">Go
back to how we did it before. The pipeline would have the
same level of robustness as before but at least we would
still be able to leverage other queueing capabilities of
Slurm.</span><o:p></o:p></li>
<li class="MsoNormal" style="color:black;mso-list:l0 level1
lfo1"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif">Write
our own Feature or Job Submit plugin that customizes this
behavior just for us. Seems possible but adds lead time
and complexity to the situation.</span><o:p></o:p></li>
</ol>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"> </span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">It's
not feasible to update the config so that all
branches/versions/commits are listed as AvailableFeatures, as our
branch ecosystem is quite large and the maintenance of that
approach would not scale well.</span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"> </span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black">Thanks,</span><span
style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal" style="line-height:14.3pt"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal" style="line-height:14.3pt"><b><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#7F7F7F"
lang="EN-GB">Raj Sahae | Manager, Software QA</span></b><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p></o:p></span></p>
<p class="MsoNormal" style="line-height:14.3pt"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#7F7F7F"
lang="FR">3500 Deer Creek Rd, Palo Alto, CA 94304</span><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p></o:p></span></p>
<p class="MsoNormal" style="line-height:14.3pt"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#7F7F7F"
lang="FR">m. +1 (408) 230-8531 | </span><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><a
href="mailto:rsahae@tesla.com"
moz-do-not-send="true"><span style="color:blue">rsahae@tesla.com</span></a><o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"
lang="FR"> </span><span
style="font-size:10.0pt;font-family:"Arial",sans-serif;color:black"><o:p></o:p></span></p>
</div>
</blockquote>
</body>
</html>