[slurm-users] FreeMem is not equal to (RealMem - AllocMem)
Pavel Vashchenkov
vashen at itam.nsc.ru
Tue Sep 14 04:52:06 UTC 2021
Hi all
I have a cluster with 6 nodes, 2 GPUs per node, and 256 GB of RAM per node.
I'm interested in the status of one node (node05). There is a job running on
this node (38 cores, 4 GB per core, 152 GB total allocated memory on the node).
When I run
"scontrol show node node05", I get the following output:
NodeName=node05 Arch=x86_64 CoresPerSocket=20
CPUAlloc=38 CPUTot=40 CPULoad=17.74
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=gpu:2(S:0-1)
NodeAddr=node05 NodeHostName=node05 Version=19.05.5
OS=Linux 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019
RealMemory=257433 AllocMem=155648 FreeMem=37773 Sockets=2 Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=cpu,gpu,hybrid
BootTime=2021-06-06T01:04:06 SlurmdStartTime=2021-06-06T01:02:31
CfgTRES=cpu=40,mem=257433M,billing=40,gres/gpu=2
AllocTRES=cpu=38,mem=152G
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
There is a line "RealMemory=257433 AllocMem=155648 FreeMem=37773
Sockets=2 Boards=1".
My question is: why is FreeMem so low (37 GB instead of the expected
~100 GB, i.e. RealMemory - AllocMem)?
PS: On other nodes the situation is similar:
RealMemory=257433 AllocMem=180224 FreeMem=7913
On a free node (not allocated to any job right now):
RealMemory=257433 AllocMem=0 FreeMem=159610
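For reference, the "expected 100 GB" above is just the subtraction of the two values scontrol reports in MB; a quick sanity check with the numbers from the node05 output:

```shell
# Values taken from the "scontrol show node node05" output above (all in MB)
real_mem=257433
alloc_mem=155648
free_mem=37773

expected=$(( real_mem - alloc_mem ))
echo "expected free: ${expected} MB"        # 101785 MB, roughly 100 GB
echo "gap: $(( expected - free_mem )) MB"   # 64012 MB unaccounted for
```

Note that, per the sinfo/scontrol man pages, FreeMem is the memory currently free as reported by the OS (i.e. MemFree), not a value Slurm computes as RealMemory - AllocMem, so the two numbers are not expected to match exactly.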
--
Pavel Vashchenkov