[slurm-users] [External] slurmd: error: Node configuration differs from hardware: CPUs=24:48(hw) Boards=1:1(hw) SocketsPerBoard=2:2(hw)
Robert Kudyba
rkudyba at fordham.edu
Thu Apr 23 18:52:41 UTC 2020
On Thu, Apr 23, 2020 at 1:43 PM Michael Robbert <mrobbert at mines.edu> wrote:
> It looks like you have hyper-threading turned on, but haven’t defined the
> ThreadsPerCore=2. You either need to turn off Hyper-threading in the BIOS
> or change the definition of ThreadsPerCore in slurm.conf.
>
Nice find. node003 has hyperthreading enabled, but node001 and node002 do
not:
[root@node001 ~]# dmidecode -t processor | grep -E '(Core Count|Thread Count)'
Core Count: 12
Thread Count: 12
Core Count: 12
Thread Count: 12
[root@node003 ~]# dmidecode -t processor | grep -E '(Core Count|Thread Count)'
Core Count: 12
Thread Count: 24
Core Count: 12
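
For comparison with the ThreadsPerCore value in slurm.conf, the same check can be made from the operating system's view of the CPUs. The following is a minimal sketch, assuming `lscpu` from util-linux is available on the node; the function name `summarize_topology` is made up for illustration. It counts the online logical CPUs and the distinct (core, socket) pairs and prints the resulting threads per core.

```shell
#!/bin/sh
# Reads lines in `lscpu -p=CPU,CORE,SOCKET` format on stdin and prints the
# number of logical CPUs, physical cores, and threads per core.
# summarize_topology is a hypothetical helper name, not part of Slurm.
summarize_topology() {
    awk -F, '
        /^#/ || NF < 3 { next }             # skip lscpu comment lines and blanks
        { cpus++; cores[$2 "," $3] = 1 }    # one entry per (core, socket) pair
        END {
            n = 0
            for (k in cores) n++
            if (n == 0) { print "no CPU lines found"; exit 1 }
            printf "CPUs=%d Cores=%d ThreadsPerCore=%d\n", cpus, n, cpus / n
        }'
}

# Usage on a real node:
#   lscpu -p=CPU,CORE,SOCKET | summarize_topology
#   slurmd -C    # prints the hardware configuration as slurmd detects it
```

By default `lscpu -p` lists only online CPUs, so a node with hyperthreads taken offline should report ThreadsPerCore=1 here.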
I found a great mini script <https://serverfault.com/a/792264/359447> to
disable hyperthreading without a reboot. I did get the following warning but
I don't think it's a big issue:
WARNING, didn't collect load info for all cpus, balancing is broken
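
The general approach to doing this at runtime is to take the secondary thread of each core offline through sysfs. The following is a rough sketch of that idea, not necessarily what the linked script does; the function name `secondary_threads` is hypothetical, and the sysfs directory is passed as an argument only so the logic can be tried against a copy.

```shell
#!/bin/sh
# Lists the logical CPUs that are secondary hyperthreads, given a sysfs CPU
# directory (normally /sys/devices/system/cpu).
# secondary_threads is a hypothetical helper name used for illustration.
secondary_threads() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        cpu=${f#"$cpu_root"/cpu}    # strip the leading path up to the CPU number
        cpu=${cpu%%/*}              # keep only the CPU number
        # The file lists siblings as "0,24" or "0-1"; the first number is
        # treated as the primary thread of the core.
        first=$(sed 's/[,-].*//' "$f")
        if [ "$cpu" != "$first" ]; then
            echo "$cpu"
        fi
    done | sort -n
}

# As root, take each secondary thread offline (this reverts on reboot):
#   for c in $(secondary_threads); do
#       echo 0 > /sys/devices/system/cpu/cpu$c/online
#   done
# Kernels that expose /sys/devices/system/cpu/smt/control also accept:
#   echo off > /sys/devices/system/cpu/smt/control
```

Because this only changes the running kernel's state, the BIOS setting still has to be changed if hyperthreading should stay off across reboots.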
Do I have to restart slurmctld on the head node and/or slurmd on node003?
Side question: are there ways with Slurm to test whether hyperthreading
improves performance and job speed?