[slurm-users] Memory usage not tracked
Xand Meaden
xand.meaden at kcl.ac.uk
Wed Jan 12 17:23:01 UTC 2022
Hi,
We wish to record the memory usage of HPC jobs, but with Slurm 20.11 we cannot
get this to work - the information is simply missing. Our two older
clusters, running Slurm 19.05, record the memory allocated to each job as a
TRES, as shown below:
# sacct --format=JobID,State,AllocTRES%32|grep RUNNING|head -4
14029267      RUNNING  billing=32,cpu=32,mem=185600M,n+
14037739      RUNNING  billing=64,cpu=64,mem=250G,node+
14037739.ba+  RUNNING  cpu=32,mem=125G,node=1
14037739.0    RUNNING  cpu=1,mem=4000M,node=1
However, with 20.11 no memory allocation is recorded at all:
# sacct --format=JobID,State,AllocTRES%32|grep RUNNING|head -4
771           RUNNING  billing=36,cpu=36,node=1
771.batch     RUNNING  cpu=36,mem=0,node=1
816           RUNNING  billing=128,cpu=128,node=1
823           RUNNING  billing=36,cpu=36,node=1
I've also checked the cluster_job_table in the Slurm database, and
tres_alloc has no "2=" (memory) entry for any job.
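For reference, this is roughly the query I ran (state 1 being RUNNING, if
I've read the enum right; tres_alloc stores TRES IDs, e.g. 1=cpu, 2=mem,
4=node, 5=billing):

mysql> SELECT id_job, tres_alloc FROM cluster_job_table WHERE state = 1 LIMIT 5;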
From my reading of https://slurm.schedmd.com/tres.html, it's not possible
to disable memory as a TRES, so I can't figure out what I'm missing
here. The 20.11 cluster is running on Ubuntu 20.04 (vs CentOS 7 for the
others), in case that makes any difference!
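In case it helps to narrow things down, my (possibly wrong) understanding is
that memory only appears in AllocTRES when it is a consumable resource in the
select plugin, i.e. with something along these lines in slurm.conf (values
purely illustrative, not our exact config):

SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory  # the _Memory part makes memory a consumable resource
DefMemPerCPU=4000                    # default memory per CPU (MB) for jobs that don't set --mem

Is that the right expectation, or does something else control whether mem
ends up in tres_alloc under 20.11?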
Thanks in advance,
Xand