[slurm-users] Memory usage not tracked

Xand Meaden xand.meaden at kcl.ac.uk
Wed Jan 12 17:23:01 UTC 2022


Hi,

We wish to record memory usage of HPC jobs, but with Slurm 20.11 we
cannot get this to work - the information is simply missing. Our two
older clusters running Slurm 19.05 record memory as a TRES, e.g. as
shown below:

# sacct --format=JobID,State,AllocTRES%32|grep RUNNING|head -4
14029267        RUNNING billing=32,cpu=32,mem=185600M,n+
14037739        RUNNING billing=64,cpu=64,mem=250G,node+
14037739.ba+    RUNNING           cpu=32,mem=125G,node=1
14037739.0      RUNNING           cpu=1,mem=4000M,node=1

However with 20.11 we see no memory usage:

# sacct --format=JobID,State,AllocTRES%32|grep RUNNING|head -4
771             RUNNING         billing=36,cpu=36,node=1
771.batch       RUNNING              cpu=36,mem=0,node=1
816             RUNNING       billing=128,cpu=128,node=1
823             RUNNING         billing=36,cpu=36,node=1

I've also checked the cluster_job_table in the Slurm accounting
database, and tres_alloc has no "2=" (memory) entry for any job.
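
For reference, the check was done with roughly the following query
(assuming the default slurm_acct_db database name; the job table
prefix depends on your ClusterName, "cluster" is just a placeholder):

# mysql slurm_acct_db -e "SELECT id_job, tres_alloc \
    FROM cluster_job_table WHERE tres_alloc NOT LIKE '%2=%' LIMIT 10"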

From my reading of https://slurm.schedmd.com/tres.html, it's not
possible to disable memory as a TRES, so I can't figure out what I'm
missing here. The 20.11 cluster is running on Ubuntu 20.04 (vs CentOS
7 for the others), in case that makes any difference!
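
In case it helps, these are the settings I understand can influence
whether memory appears in AllocTRES (SelectTypeParameters is only a
guess on my part); I'm happy to share the output from the 20.11
cluster:

# sacctmgr show tres
# scontrol show config | grep -i -E 'AccountingStorageTRES|SelectTypeParameters'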

Thanks in advance,
Xand


