[slurm-users] "Low socket*core*thre" - solution?

John Kelly john.kelly at broadcom.com
Wed May 2 19:10:52 MDT 2018


Hi Matt,

scontrol update nodename=odin state=resume
scontrol update nodename=odin state=idle
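
That drain reason usually just means the node registered with fewer
Sockets*Cores*Threads than slurm.conf claimed at the time. Since your
slurm.conf now matches what slurmd -C reports on odin, resuming should be
enough to clear it. You can confirm afterward with something like
(assuming the node is still named odin):

scontrol show node odin | grep -i reason
sinfo -R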

-jfk
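
PS: if odin goes straight back into drain after you resume it, double-check
that both slurmctld and the slurmd on odin have actually re-read the edited
slurm.conf first. Something along these lines should do it (the systemd unit
name may differ on your install):

scontrol reconfigure        # ask the Slurm daemons to re-read slurm.conf
systemctl restart slurmd    # on odin, if the reconfigure alone doesn't help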



On Wed, May 2, 2018 at 5:28 PM, Matt Hohmeister <hohmeister at psy.fsu.edu>
wrote:

> I have a two-node cluster: the server/compute node is a Dell PowerEdge
> R730; the compute node, a Dell PowerEdge R630. On both of these nodes,
> slurmd -C gives me the exact same line:
>
>
>
> [me at odin slurm]$ slurmd -C
>
> NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
>
>
>
> [me at thor slurm]$ slurmd -C
>
> NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
>
>
>
> So I edited my slurm.conf appropriately:
>
>
>
> NodeName=odin CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
>
> NodeName=thor CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=2 RealMemory=128655
>
>
>
> …and it looks good, except for the drain on my server/compute node:
>
>
>
> [me at odin slurm]$ sinfo
>
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
>
> debug*       up   infinite      1  drain odin
>
> debug*       up   infinite      1   idle thor
>
>
>
> …for the following reason:
>
>
>
> [me at odin slurm]$ sinfo -R
>
> REASON               USER      TIMESTAMP           NODELIST
>
> Low socket*core*thre slurm     2018-05-02T11:55:38 odin
>
>
>
> Any ideas?
>
>
>
> Thanks!
>
>
>
> Matt Hohmeister
>
> Systems and Network Administrator
>
> Department of Psychology
>
> Florida State University
>
> PO Box 3064301
>
> Tallahassee, FL 32306-4301
>
> Phone: +1 850 645 1902
>
> Fax: +1 850 644 7739
>
>
>