[slurm-users] slurmd -C showing an incorrect number of cores.
    mike tie 
    mtie at carleton.edu
       
    Sun Mar  8 20:24:35 UTC 2020
    
    
  
I am running a slurm client on a virtual machine.  The virtual machine
originally had a core count of 10, but I have now increased the cores to
16, and "slurmd -C" continues to show 10.  I have increased the core count
in the slurm.conf file, and that change is being seen correctly.  The node
is stuck in a Drain state because of this conflict.  How do I get
slurmd -C to see the new number of cores?
I'm running slurm 18.08.  I have tried running "scontrol reconfigure" on
the head node.  I have restarted slurmd on all the client nodes, and I have
restarted slurmctld on the master node.
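For reference, the sequence I tried looks roughly like this (the systemd unit names are my assumption; adjust to however slurmd/slurmctld are started on your site):

```shell
# Steps tried so far after editing slurm.conf (unit names assumed):
scontrol reconfigure            # on the head node, to push the new config
systemctl restart slurmd        # on each compute node
systemctl restart slurmctld     # on the master node
```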
Where is the data about compute node CPUs stored?  I can't seem to find a
config or settings file on the compute node.
The compute node that I am working on is "liverpool"
mtie@liverpool ~ $ slurmd -C
NodeName=liverpool CPUs=10 Boards=1 SocketsPerBoard=10 CoresPerSocket=1
ThreadsPerCore=1 RealMemory=64263
UpTime=1-21:55:36
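For anyone comparing, the CPUs= field can be pulled out of that line like this (a sketch that feeds in the captured output as a stand-in, so the parsing can be tried anywhere; on the live node you would pipe `slurmd -C` in instead):

```shell
# Extract the CPUs= value from a slurmd -C line. The captured output
# above is used as a stand-in string so this runs on any machine.
line='NodeName=liverpool CPUs=10 Boards=1 SocketsPerBoard=10 CoresPerSocket=1'
cpus=$(printf '%s\n' "$line" | grep -o 'CPUs=[0-9]*' | cut -d= -f2)
echo "$cpus"   # on the node itself: slurmd -C | grep -o 'CPUs=[0-9]*' | cut -d= -f2
```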
mtie@liverpool ~ $ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             4
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            15
Model:                 6
Model name:            Common KVM processor
Stepping:              1
CPU MHz:               2600.028
BogoMIPS:              5200.05
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
L3 cache:              16384K
NUMA node0 CPU(s):     0-15
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc
nopl xtopology eagerfpu pni cx16 x2apic hypervisor lahf_lm
mtie@liverpool ~ $ more /etc/slurm/slurm.conf | grep liverpool
NodeName=liverpool NodeAddr=137.22.10.202 CPUs=16 State=UNKNOWN
PartitionName=BioSlurm Nodes=liverpool  Default=YES MaxTime=INFINITE
State=UP
mtie@liverpool ~ $ sinfo -n liverpool -o %c
CPUS
16
mtie@liverpool ~ $ sinfo -n liverpool -o %E
REASON
Low socket*core*thread count, Low CPUs
Any advice?
Michael Tie    Technical Director
Mathematics, Statistics, and Computer Science
 One North College Street              phn:  507-222-4067
 Northfield, MN 55057                   cel:    952-212-8933
 mtie at carleton.edu                        fax:    507-222-4312
    
    