<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal>I’m pretty sure that you should only need to restart slurmd on the node that was reporting the problem. If the node was put into a drained state, you may need to undrain it manually using scontrol.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Testing job performance is not the job of the scheduler; it just schedules the jobs that you tell it to. You’ll need to run those tests yourself. <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Mike<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:12.0pt;color:black'>From: </span></b><span style='font-size:12.0pt;color:black'>slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Robert Kudyba &lt;rkudyba@fordham.edu&gt;<br><b>Reply-To: </b>Slurm User Community List &lt;slurm-users@lists.schedmd.com&gt;<br><b>Date: </b>Thursday, April 23, 2020 at 12:55<br><b>To: </b>Slurm User Community List &lt;slurm-users@lists.schedmd.com&gt;<br><b>Subject: </b>Re: [slurm-users] [External] slurmd: error: Node configuration differs from hardware: CPUs=24:48(hw) Boards=1:1(hw) SocketsPerBoard=2:2(hw)<o:p></o:p></span></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div style='border:solid #9C6500 1.0pt;padding:2.0pt 2.0pt 2.0pt 2.0pt'><p class=MsoNormal style='line-height:12.0pt;background:#FFEB9C'><b><span style='font-size:10.0pt;color:#9C6500'>CAUTION:</span></b><span style='font-size:10.0pt;color:black'> This email originated from outside of the Colorado School of Mines organization. 
Do not click on links or open attachments unless you recognize the sender and know the content is safe.<o:p></o:p></span></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><div><p class=MsoNormal><o:p> </o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Thu, Apr 23, 2020 at 1:43 PM Michael Robbert &lt;<a href="mailto:mrobbert@mines.edu">mrobbert@mines.edu</a>&gt; wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>It looks like you have hyper-threading turned on, but haven’t defined ThreadsPerCore=2. You need to either turn off hyper-threading in the BIOS or change the definition of ThreadsPerCore in slurm.conf.<o:p></o:p></p></div></div></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Nice find. node003 has hyper-threading enabled, but node001 and node002 do not:<o:p></o:p></p></div><div><p class=MsoNormal><span style='font-family:"Courier New"'>[root@node001 ~]# dmidecode -t processor | grep -E '(Core Count|Thread Count)'<br> Core Count: 12<br> Thread Count: 12<br> Core Count: 12<br> Thread Count: 12<br><br>[root@node003 ~]# dmidecode -t processor | grep -E '(Core Count|Thread Count)'<br> Core Count: 12<br> Thread Count: 24<br> Core Count: 12</span><o:p></o:p></p></div><div><p class=MsoNormal>I <a href="https://serverfault.com/a/792264/359447">found a great mini script</a> to disable hyper-threading without a reboot. 
I did get the following warning, but I don't think it's a big issue:<o:p></o:p></p></div><div><p class=MsoNormal><span style='font-family:"Courier New"'> WARNING, didn't collect load info for all cpus, balancing is broken</span><o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Do I have to restart slurmctld on the head node and/or slurmd on node003?<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Side question: are there ways in Slurm to test whether hyper-threading improves performance and job speed?<o:p></o:p></p></div></div></div></div></div></body></html>