[slurm-users] sacct: error
Eric F. Alemany
ealemany at stanford.edu
Sat May 5 10:00:44 MDT 2018
Working on weekends - hey ?
When I run "slurmd -C" on one of my execute nodes, I get:
eric at radonc01:~$ slurmd -C
NodeName=radonc01 slurmd: Considering each NUMA node as a socket
CPUs=32 Boards=1 SocketsPerBoard=4 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=64402
Also, when I run "lscpu", I get:
eric at radonc01:~$ lscpu
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 21
Model name: AMD Opteron(tm) Processor 6376
The two commands seem to give different results (?). What do you think?
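For comparison, here is a minimal Python sketch that does the arithmetic on the two outputs pasted above. It is only an illustration: the helper names (parse_slurmd, parse_lscpu) are made up for this example, the input strings are copied from the excerpts in this message, and the socket count for lscpu is inferred from the other fields because the "Socket(s)" line is not part of the excerpt.

```python
# Values copied from the "slurmd -C" and "lscpu" excerpts in this message.
SLURMD_LINE = ("CPUs=32 Boards=1 SocketsPerBoard=4 CoresPerSocket=8 "
               "ThreadsPerCore=1 RealMemory=64402")
LSCPU_TEXT = """\
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
NUMA node(s): 4
"""


def parse_slurmd(line):
    """Hypothetical helper: turn Key=Value tokens into a dict of ints."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and value.isdigit():
            fields[key] = int(value)
    return fields


def parse_lscpu(text):
    """Hypothetical helper: turn 'Key: value' rows into a dict of strings."""
    fields = {}
    for row in text.splitlines():
        key, sep, value = row.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


slurmd = parse_slurmd(SLURMD_LINE)
# Layout as slurmd reports it: boards x sockets x cores x threads.
slurmd_product = (slurmd["Boards"] * slurmd["SocketsPerBoard"]
                  * slurmd["CoresPerSocket"] * slurmd["ThreadsPerCore"])

lscpu = parse_lscpu(LSCPU_TEXT)
first, last = (int(n) for n in lscpu["On-line CPU(s) list"].split("-"))
online_cpus = last - first + 1
threads_per_core = int(lscpu["Thread(s) per core"])
cores_per_socket = int(lscpu["Core(s) per socket"])
# Inferred, not read: the excerpt does not include the "Socket(s)" row.
implied_sockets = online_cpus // (threads_per_core * cores_per_socket)

print("slurmd -C:", slurmd_product, "logical CPUs as 4 x 8 x 1")
print("lscpu:    ", online_cpus, "logical CPUs, implied sockets =",
      implied_sockets)
```

Both outputs account for the same 32 logical CPUs; they differ only in how those 32 are split into sockets, cores and threads (4 x 8 x 1 from slurmd, which says it treats each NUMA node as a socket, versus an implied 2 x 8 x 2 from lscpu).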
Eric F. Alemany
System Administrator for Research
Division of Radiation & Cancer Biology
Department of Radiation Oncology
Stanford University School of Medicine
Stanford, California 94305
Tel:1-650-498-7969 No Texting
On May 5, 2018, at 5:42 AM, Chris Samuel <chris at csamuel.org> wrote:
On Saturday, 5 May 2018 2:45:19 AM AEST Eric F. Alemany wrote:
With Ray's suggestion I get an error message for each node. Here I am giving
you only one error message, from one node.
sacct: error: NodeNames=radonc01 CPUs=32 doesn't match
Sockets*CoresPerSocket*ThreadsPerCore (16), resetting CPUs
The interesting thing is that if you follow the
Sockets*CoresPerSocket*ThreadsPerCore formula, 2x8x2 = 32; however, look above
and it says (16). Strange, no?
No, Slurm is right. CPUs != threads. You've got 16 CPU cores, each with 2
threads. So in this configuration you can schedule 16 tasks per node and each
task can use 2 threads.
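The consistency check behind that error message can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not Slurm's actual code: the function names are invented, and because the slurm.conf node definition is not shown in this thread, the socket/core/thread values used for the mismatch case are a hypothetical combination that happens to multiply to 16.

```python
def expected_cpus(sockets, cores_per_socket, threads_per_core):
    """Product that the error message compares against CPUs."""
    return sockets * cores_per_socket * threads_per_core


def check_node(name, cpus, sockets, cores_per_socket, threads_per_core):
    """Return a message modelled on the sacct error, or None if consistent.

    Hypothetical helper for illustration; not part of Slurm.
    """
    product = expected_cpus(sockets, cores_per_socket, threads_per_core)
    if cpus != product:
        return (f"NodeNames={name} CPUs={cpus} doesn't match "
                f"Sockets*CoresPerSocket*ThreadsPerCore ({product})")
    return None


# Hypothetical definition: 2 sockets x 8 cores x 1 thread, but CPUs=32.
mismatch = check_node("radonc01", 32, 2, 8, 1)
# The 2x8x2 layout from the message above is consistent with CPUs=32.
match = check_node("radonc01", 32, 2, 8, 2)

print(mismatch)
print(match)
```

The point of the sketch is that the (16) in the error comes from whatever Sockets, CoresPerSocket and ThreadsPerCore values are configured for the node, so those are the values to compare against CPUs=32.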
What does "slurmd -C" say on that node?
All the best,
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC