<div dir="ltr">To explain in more detail.<div><br></div><div>Jobs are submitted by core count and can land on any nodes, but the application is limited to 4 nodes in total. The license server counts the distinct nodes in use, and once that count reaches 4 it refuses any additional node. It does not depend on the number of cores available on the nodes.</div><div><br><div><span style="font-family:Calibri,sans-serif;font-size:11pt">Case 1: 4 jobs are running with 4 cores each on 4 nodes
[node1, node2, node3 and node4].</span><br></div><div>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">
If Slurm then assigns a fifth job with 4 cores to any one of node1, node2, node3 or node4, the license is granted.</p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"> </p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">Case 2: 4 jobs are running with 4 cores each on 4 nodes
[node1, node2, node3 and node4].</p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif">
If Slurm instead assigns the fifth job to node5 with 4 cores, the license is
denied (the job fails with a "license not found" error).</p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><br></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span style="font-family:Arial,Helvetica,sans-serif;font-size:small">Regards</span><br></p><div>Navin.</div><div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, May 6, 2020 at 7:47 PM Renfro, Michael <<a href="mailto:Renfro@tntech.edu">Renfro@tntech.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">To make sure I’m reading this correctly, you have a software license that lets you run jobs on up to 4 nodes at once, regardless of how many CPUs you use? That is, you could run any one of the following sets of jobs:<br>
<br>
- four 1-node jobs,<br>
- two 2-node jobs,<br>
- one 1-node and one 3-node job,<br>
- two 1-node and one 2-node jobs,<br>
- one 4-node job,<br>
<br>
simultaneously? And the license isn’t node-locked to specific nodes by MAC address or anything similar? But if you try to run jobs beyond what I’ve listed above, you run out of licenses, and you want those later jobs to be held until licenses are freed up?<br>
<br>
If all of those questions have an answer of ‘yes’, I think you want the remote license section of <a href="https://slurm.schedmd.com/licenses.html" rel="noreferrer" target="_blank">https://slurm.schedmd.com/licenses.html</a>, something like:<br>
<br>
sacctmgr add resource name=software_name count=4 percentallowed=100 server=flex_host servertype=flexlm type=license<br>
<br>
and submit jobs with a '-L software_name:N' flag, where N is the number of nodes you want to run on.<br>
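<br>
As a rough sketch of that submission step (not tested against the setup in this thread), the Python snippet below builds an sbatch command line that requests one license token per node. The helper name build_sbatch_cmd, the license name software_name and the script name job.sh are placeholders for illustration; -N and -L are the standard sbatch options for node count and licenses.<br>

```python
import shlex

LICENSE_NAME = "software_name"  # placeholder; must match the name given to sacctmgr
LICENSE_LIMIT = 4               # node limit described in this thread


def build_sbatch_cmd(script, nodes, license_name=LICENSE_NAME):
    """Return an sbatch argument list requesting one license per node."""
    if not 1 <= nodes <= LICENSE_LIMIT:
        raise ValueError(f"nodes must be between 1 and {LICENSE_LIMIT}")
    # -N sets the node count, -L requests that many license tokens.
    return ["sbatch", "-N", str(nodes), "-L", f"{license_name}:{nodes}", script]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Slurm.
    print(shlex.join(build_sbatch_cmd("job.sh", 2)))
```

Note that this counts one token per node per job, so it assumes jobs do not share nodes; it would not capture a case where an extra job on an already-licensed node is still allowed.<br>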
<br>
> On May 6, 2020, at 5:33 AM, navin srivastava <<a href="mailto:navin.altair@gmail.com" target="_blank">navin.altair@gmail.com</a>> wrote:<br>
> <br>
> Thanks Michael.<br>
> <br>
> Actually, one application's license is node-based, and we have a 4-node license (not tied to fixed nodes). We have several nodes, but once jobs have landed on any 4 nodes, the application runs on those nodes only. After that, it fails if a job goes to any other node.<br>
> <br>
> Can we define a custom variable and set it at the node level, so that when a user submits a job and passes that variable, the job lands on those specific nodes?<br>
> I do not want to create a separate partition. <br>
> <br>
> Is there any way to achieve this by any other method?<br>
> <br>
> Regards<br>
> Navin.<br>
> <br>
> On Tue, May 5, 2020 at 7:46 PM Renfro, Michael <<a href="mailto:Renfro@tntech.edu" target="_blank">Renfro@tntech.edu</a>> wrote:<br>
> Haven’t done it yet myself, but it’s on my todo list.<br>
> <br>
> But I’d assume that if you use the FlexLM or RLM parts of that documentation, that Slurm would query the remote license server periodically and hold the job until the necessary licenses were available.<br>
> <br>
> > On May 5, 2020, at 8:37 AM, navin srivastava <<a href="mailto:navin.altair@gmail.com" target="_blank">navin.altair@gmail.com</a>> wrote:<br>
> > <br>
> > Thanks Michael,<br>
> > <br>
> > Yes, I have gone through it, but the licenses are remote licenses and will be used outside Slurm as well, not only within it.<br>
> > So basically I would like to know how we can update the database dynamically to get the exact value at that point in time.<br>
> > I mean, query the license server and update the database accordingly. Does Slurm automatically update the value based on usage?<br>
> > <br>
> > <br>
> > Regards<br>
> > Navin.<br>
> > <br>
> > <br>
> > On Tue, May 5, 2020 at 7:00 PM Renfro, Michael <<a href="mailto:Renfro@tntech.edu" target="_blank">Renfro@tntech.edu</a>> wrote:<br>
> > Have you seen <a href="https://slurm.schedmd.com/licenses.html" rel="noreferrer" target="_blank">https://slurm.schedmd.com/licenses.html</a> already? If the software is just for use inside the cluster, one Licenses= line in slurm.conf plus users submitting with the -L flag should suffice. You should be able to set that license value to 4 if it’s licensed per node and you can run up to 4 jobs simultaneously, or 4*NCPUS if it’s licensed per CPU, or 1 if it’s a single license good for one run on 1-4 nodes.<br>
> > <br>
> > There are also options to query a FlexLM or RLM server for license management.<br>
> > <br>
> > -- <br>
> > Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services<br>
> > 931 372-3601 / Tennessee Tech University<br>
> > <br>
> > > On May 5, 2020, at 7:54 AM, navin srivastava <<a href="mailto:navin.altair@gmail.com" target="_blank">navin.altair@gmail.com</a>> wrote:<br>
> > > <br>
> > > Hi Team,<br>
> > > <br>
> > > We have an application whose license is limited: it scales up to 4 nodes (~80 cores).<br>
> > > If those 4 nodes are full and a job lands on a 5th node, the job fails.<br>
> > > We want to add a restriction so that the application cannot start on more than 4 nodes; instead of failing, the job should stay queued.<br>
> > > I do not want to keep a separate partition to achieve this config. Is there a way to achieve this using some dynamic resource that checks the license count on the fly and, once the limit is reached, keeps the job in the queue?<br>
> > > <br>
> > > Regards<br>
> > > Navin.<br>
> > > <br>
> > > <br>
> > > <br>
> > <br>
> <br>
<br>
</blockquote></div>