[slurm-users] "Low socket*core*thre" - solution?

Matt Hohmeister hohmeister at psy.fsu.edu
Wed May 2 18:28:46 MDT 2018


I have a two-node cluster: the server/compute node is a Dell PowerEdge R730; the compute node, a Dell PowerEdge R630. On both of these nodes, slurmd -C gives me exactly the same output:

[me at odin slurm]$ slurmd -C
NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655

[me at thor slurm]$ slurmd -C
NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655

So I edited my slurm.conf appropriately:

NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
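After a change like this, the daemons have to re-read slurm.conf before the new node definitions take effect. Assuming systemd-managed Slurm services (not stated in the post), something like this does it:

```shell
# Ask slurmctld to re-read slurm.conf on the fly:
scontrol reconfigure

# Or, more heavy-handedly, restart the daemons on each node
# (slurmctld on the controller, slurmd on every compute node):
systemctl restart slurmctld
systemctl restart slurmd
```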

...and it looks good, except for the drain on my server/compute node:

[me at odin slurm]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      1  drain odin
debug*       up   infinite      1   idle thor

...for the following reason:

[me at odin slurm]$ sinfo -R
REASON               USER      TIMESTAMP           NODELIST
Low socket*core*thre slurm     2018-05-02T11:55:38 odin
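The reason column is truncated by sinfo's default field width; the full text and a manual resume can be had with something like the following (a sketch, run as a Slurm admin, node name taken from the output above):

```shell
# Widen the reason field (%E) to see the full drain reason,
# along with user (%u), timestamp (%H), and nodelist (%N):
sinfo -R -o "%60E %9u %19H %N"

# Once slurm.conf matches the hardware reported by `slurmd -C`,
# a node drained for a past mismatch stays drained until cleared:
scontrol update NodeName=odin State=RESUME
```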

Any ideas?

Thanks!

Matt Hohmeister
Systems and Network Administrator
Department of Psychology
Florida State University
PO Box 3064301
Tallahassee, FL 32306-4301
Phone: +1 850 645 1902
Fax: +1 850 644 7739



More information about the slurm-users mailing list