[slurm-users] seff: incorrect memory usage (18.08.5-2)

Christopher Benjamin Coffey Chris.Coffey at nau.edu
Tue Feb 26 14:32:57 UTC 2019

Hi Loris,

Odd, we never saw that issue with the memory efficiency being out of whack, just the CPU efficiency. We are running 18.08.5-2, and here is a 512 core job run last night:

Job ID: 18096693
Array Job ID: 18096693_5
Cluster: monsoon
User/Group: abc123/cluster
State: COMPLETED (exit code 0)
Nodes: 60
Cores per node: 8
CPU Utilized: 01:34:06
CPU Efficiency: 58.04% of 02:42:08 core-walltime
Job Wall-clock time: 00:00:19
Memory Utilized: 36.04 GB (estimated maximum)
Memory Efficiency: 30.76% of 117.19 GB (1.95 GB/node)
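
If it helps to compare, the raw accounting data seff works from can be pulled straight out of sacct. The format fields below are all standard sacct fields; MaxRSS is per task, so it won't line up exactly with the aggregated "Memory Utilized" line above, but it's a quick sanity check that the underlying numbers are sane:

  $ sacct -j 18096693_5 --format=JobID,MaxRSS,AveRSS,TotalCPU,Elapsed,NNodes,NCPUS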

Out of curiosity, which job accounting (jobacct_gather), task, and proctrack plugins are you using? We are using:


Also cgroup.conf:


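For anyone following along, the plugins in question are set in slurm.conf, and the memory enforcement lives in cgroup.conf. Purely as an illustration (example values, not necessarily what we or any other site runs), the relevant lines look like:

  # slurm.conf -- example values only
  JobAcctGatherType=jobacct_gather/linux
  ProctrackType=proctrack/cgroup
  TaskPlugin=task/affinity,task/cgroup

  # cgroup.conf -- example values only
  CgroupAutomount=yes
  ConstrainCores=yes
  ConstrainRAMSpace=yes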

Christopher Coffey
High-Performance Computing
Northern Arizona University

On 2/26/19, 2:15 AM, "slurm-users on behalf of Loris Bennett" <slurm-users-bounces at lists.schedmd.com on behalf of loris.bennett at fu-berlin.de> wrote:

    With seff 18.08.5-2 we have been getting spurious results regarding
    memory usage:
      $ seff 1230_27
      Job ID: 1234
      Array Job ID: 1230_27
      Cluster: curta
      User/Group: xxxxxxxxx/xxxxxxxxx
      State: COMPLETED (exit code 0)
      Nodes: 4
      Cores per node: 25
      CPU Utilized: 9-16:49:18
      CPU Efficiency: 30.90% of 31-09:35:00 core-walltime
      Job Wall-clock time: 07:32:09
      Memory Utilized: 48.00 EB (estimated maximum)
      Memory Efficiency: 26388279066.62% of 195.31 GB (1.95 GB/core)
    It seems that the more cores are involved, the worse the overcalculation
    is, but not linearly.
    Has anyone else seen this?
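    If anyone wants to dig further: my understanding (an assumption on my
    part) is that seff derives its memory figure from the TRES usage stored
    in the accounting database, so the raw value can be checked with
    something like
      $ sacct -j 1230_27 -o JobID,TRESUsageInTot%80
    If an absurd mem= value already shows up there, the problem is in what
    gets recorded rather than in seff's arithmetic.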
    Dr. Loris Bennett (Mr.)
    ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de
