[slurm-users] FreeMem is not equal to (RealMem - AllocMem)

Pavel Vashchenkov vashen at itam.nsc.ru
Tue Sep 14 04:52:06 UTC 2021


Hi all

I have a cluster with 6 nodes, 2 GPUs per node, and 256 GB of RAM per node.

I'm interested in the status of one node (its name is node05). There is a job on
this node (38 cores, 4 GB per core, 152 GB total allocated memory on the node).
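For reference, the job's memory figure above is just cores times per-core memory (a quick sanity check, using the numbers as described):

```python
# Check that 38 cores at 4 GiB per core matches the stated 152 GiB total.
cores = 38
mem_per_core_gib = 4
total_gib = cores * mem_per_core_gib
print(total_gib)         # 152 (GiB)
print(total_gib * 1024)  # 155648 (MiB), the AllocMem value scontrol reports
```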

When I run

scontrol show node node05, I get the following output:

NodeName=node05 Arch=x86_64 CoresPerSocket=20
    CPUAlloc=38 CPUTot=40 CPULoad=17.74
    AvailableFeatures=(null)
    ActiveFeatures=(null)
    Gres=gpu:2(S:0-1)
    NodeAddr=node05 NodeHostName=node05 Version=19.05.5
    OS=Linux 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019
    RealMemory=257433 AllocMem=155648 FreeMem=37773 Sockets=2 Boards=1
    State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
    Partitions=cpu,gpu,hybrid
    BootTime=2021-06-06T01:04:06 SlurmdStartTime=2021-06-06T01:02:31
    CfgTRES=cpu=40,mem=257433M,billing=40,gres/gpu=2
    AllocTRES=cpu=38,mem=152G
    CapWatts=n/a
    CurrentWatts=0 AveWatts=0
    ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


There is a line "RealMemory=257433 AllocMem=155648 FreeMem=37773 
Sockets=2 Boards=1"


My question is: why is FreeMem so low (37 GB instead of the expected 
100 GB, i.e. RealMem - AllocMem)?
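To make the expected figure explicit (a sketch using the MB values from the scontrol output above):

```python
# If FreeMem were simply RealMemory - AllocMem on node05,
# this is what we would expect to see.
real_mem_mb = 257433
alloc_mem_mb = 155648
expected_mb = real_mem_mb - alloc_mem_mb
print(expected_mb)                # 101785 (MB)
print(round(expected_mb / 1024))  # 99 (GB), i.e. the expected ~100 GB
```

Instead, scontrol reports FreeMem=37773, roughly 37 GB.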

PS On other nodes the situation is similar:
RealMemory=257433 AllocMem=180224 FreeMem=7913

On a free node (not allocated to any computation right now):
RealMemory=257433 AllocMem=0 FreeMem=159610
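The same arithmetic for the other nodes shows the same gap (again just a sketch from the reported values):

```python
# Expected free memory (RealMemory - AllocMem) for the two nodes in the PS.
real_mem_mb = 257433
print(real_mem_mb - 180224)  # 77209 (MB) expected, vs 7913 reported
print(real_mem_mb - 0)       # 257433 (MB) expected on the idle node, vs 159610 reported
```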


-- 
Pavel Vashchenkov



