<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><font face="Nimbus Roman No9 L">Dear Loris: Many thanks for your
response. <br>
</font></p>
<p><font face="Nimbus Roman No9 L">I changed State=IDLE to
State=UNKNOWN in the NodeName configuration and then reloaded <b>slurmctld</b>.
Two GPU nodes (gpu3 &amp; gpu4) came up in the drain state, although
I have manually set those same nodes to the IDLE state.<br>
</font></p>
<p><font face="Nimbus Roman No9 L">But how do I change the
CoresPerSocket and ThreadsPerCore in the NodeName parameter?<br>
</font></p>
<p><font face="Nimbus Roman No9 L"><img
src="cid:part1.A69613EC.8B665743@iitgn.ac.in" alt=""><br>
<img src="cid:part2.BD985D28.A96141A9@iitgn.ac.in" alt=""></font></p>
<pre class="moz-signature" cols="72">Thanks &amp; Regards,
Sudeep Narayan Banerjee</pre>
<div class="moz-cite-prefix">On 18/05/20 7:29 pm, Loris Bennett
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:87sgfx5op9.fsf@hornfels.zedat.fu-berlin.de">
<pre class="moz-quote-pre" wrap="">Hi Sudeep,
I am not sure if this is the cause of the problem but in your slurm.conf
you have
# COMPUTE NODES
NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16 RealMemory=60000 State=IDLE
NodeName=gpu[1-2] CPUs=16 Gres=gpu:2 State=IDLE
NodeName=node[11-22] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Procs=32 State=IDLE
NodeName=node[23-24] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
NodeName=gpu[3-4] CPUs=32 Gres=gpu:1 State=IDLE
But if you read
man slurm.conf
you will find the following under the description of the parameter
"State" for nodes:
"IDLE" should not be specified in the node configuration, but set the
node state to "UNKNOWN" instead.
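As a rough sketch of what that change would look like (assuming nothing
else in the node definitions needs to change), the lines above would become:
NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16 RealMemory=60000 State=UNKNOWN
NodeName=gpu[1-2] CPUs=16 Gres=gpu:2 State=UNKNOWN
NodeName=node[11-22] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Procs=32 State=UNKNOWN
NodeName=node[23-24] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=UNKNOWN
NodeName=gpu[3-4] CPUs=32 Gres=gpu:1 State=UNKNOWN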
Cheers,
Loris
Sudeep Narayan Banerjee <a class="moz-txt-link-rfc2396E" href="mailto:snbanerjee@iitgn.ac.in">&lt;snbanerjee@iitgn.ac.in&gt;</a> writes:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Dear Loris: I am very sorry for addressing you as Support; it has
become a bad habit of mine, which I will change. Sincere apologies!
Yes, I have tried adding the nodes with the mixed hardware, but when
slurmctld starts it reports a mismatch in the core count, the
existing 32-core nodes go to the Down/Drng state, and the new 40-core
nodes are set to IDLE.
Any help or a pointer to some documentation would be highly appreciated!
Thanks &amp; Regards,
Sudeep Narayan Banerjee
System Analyst | Scientist B
Information System Technology Facility
Academic Block 5 | Room 110
Indian Institute of Technology Gandhinagar
Palaj, Gujarat 382355 INDIA
On 18/05/20 6:30 pm, Loris Bennett wrote:
Dear Sudeep,
Sudeep Narayan Banerjee <a class="moz-txt-link-rfc2396E" href="mailto:snbanerjee@iitgn.ac.in">&lt;snbanerjee@iitgn.ac.in&gt;</a> writes:
&gt; Dear Support,
This mailing list is not really the Slurm support list. It is just the
Slurm User Community List, so basically a bunch of people just like you.
&gt; node11-22 have 2 sockets with 16 cores each and node23-24 have 2 sockets
&gt; with 20 cores each. In the slurm.conf file (attached), can we merge all
&gt; the nodes 11-24 (which have different core counts) into a single queue
&gt; or partition name?
Yes, you can have a partition consisting of heterogeneous nodes. Have
you tried this? Was there a problem?
Cheers,
Loris
</pre>
</blockquote>
</blockquote>
</body>
</html>