[slurm-users] "Low socket*core*thre" - solution?
Matt Hohmeister
hohmeister at psy.fsu.edu
Wed May 2 18:28:46 MDT 2018
I have a two-node cluster: the server/compute node is a Dell PowerEdge R730; the compute node, a Dell PowerEdge R630. On both of these nodes, slurmd -C gives me the exact same line:
[me at odin slurm]$ slurmd -C
NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
[me at thor slurm]$ slurmd -C
NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
So I edited my slurm.conf appropriately:
NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
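For context on the steps I took after editing: since the node definitions now match `slurmd -C`, the usual follow-up (assuming a standard Slurm setup; exact daemon restart procedure may differ on your install) is to push the new config to the daemons and then clear the node's drain state by hand, since a node does not automatically leave `drain` once the mismatch is fixed:

```shell
# Reload slurm.conf on the controller and all slurmd daemons
# (alternatively, restart slurmctld and slurmd via systemctl).
scontrol reconfigure

# A drained node must be resumed explicitly; the drain flag persists
# even after the underlying config error is corrected.
scontrol update NodeName=odin State=RESUME
```

(Commands shown are standard `scontrol` subcommands; node name `odin` is from the output above.)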
...and it looks good, except for the drain on my server/compute node:
[me at odin slurm]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up infinite 1 drain odin
debug* up infinite 1 idle thor
...for the following reason:
[me at odin slurm]$ sinfo -R
REASON USER TIMESTAMP NODELIST
Low socket*core*thre slurm 2018-05-02T11:55:38 odin
Any ideas?
Thanks!
Matt Hohmeister
Systems and Network Administrator
Department of Psychology
Florida State University
PO Box 3064301
Tallahassee, FL 32306-4301
Phone: +1 850 645 1902
Fax: +1 850 644 7739