[slurm-users] Slurm memory units

Peter Kjellström cap at nsc.liu.se
Wed May 6 10:12:45 UTC 2020


On Wed, 6 May 2020 10:42:46 +0100
Killian Murphy <killian.murphy at york.ac.uk> wrote:

> Hi all.
> 
> I'm probably making a rookie error here...which 'megabyte' (powers of
> 1000 or 1024) does the Slurm documentation refer to in, for example,
> the slurm.conf documentation for RealMemory and the sbatch
> documentation for `--mem`?
> 
> Most of our nodes have the same physical memory configuration. From
> the output of `free -m` and `slurmd -C` on one of the nodes, we have
> 191668M (187G). Consequently, `RealMemory` for those nodes has been
> set to 191668 in slurm.conf. As a result, when a user requests memory
> above '187G' for node memory, Slurm reports to them that the
> requested node configuration is not available.

Well, yes, it's all base 2 (powers of 1024). But it seems you have two
problems: 1) the units, and 2) users expecting 192GiB out of your nodes
when the memory actually available is always lower (187G in your case).

We see #2 as well: users know our thin nodes are 96G, and some then
proceed to request 96G (which does not fit on the 96G nodes...).
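
To illustrate #2 with the numbers from the quoted message, here is a
minimal Python sketch. It assumes that a G suffix means 1024 MiB and
that a request fits only if it is no larger than RealMemory. The helper
name gib_to_mib is made up for this example, and the real scheduler may
take other settings into account.

```python
MIB_PER_GIB = 1024  # assumption: "G" is treated as GiB (powers of 1024)

def gib_to_mib(gib):
    """Convert a whole number of GiB to MiB (hypothetical helper)."""
    return gib * MIB_PER_GIB

# RealMemory value (in MiB) taken from the quoted message
real_memory_mib = 191668

for request_gib in (187, 188, 192):
    request_mib = gib_to_mib(request_gib)
    fits = request_mib <= real_memory_mib
    print(f"--mem={request_gib}G -> {request_mib} MiB, fits: {fits}")
```

Under these assumptions 187G (191488 MiB) fits, while 188G (192512 MiB)
and 192G (196608 MiB) do not.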

From our system:
 $ LOCALINTERACTIVECOMMAND interactive --mem=3g -n 1
 ...
 $ cat /sys/fs/cgroup/memory/slurm/uid_x/job_y/memory.limit_in_bytes
 3221225472 # 3.0 GiB

(This is on slurm-18.08.8 with memory cgroups.)
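
The cgroup limit above matches what you get when the suffix is read as
a power of 1024. A rough Python sketch of that arithmetic follows. The
function mem_to_bytes and its suffix table are illustrative assumptions,
not Slurm's own parsing code.

```python
# Assumed multipliers: each suffix is a power of 1024
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

def mem_to_bytes(spec):
    """Convert a string such as '3g' to bytes (hypothetical helper)."""
    spec = spec.strip().lower()
    if spec[-1] in UNITS:
        return int(spec[:-1]) * UNITS[spec[-1]]
    # No suffix: treat the number as megabytes, the documented default
    return int(spec) * UNITS["m"]

print(mem_to_bytes("3g"))  # 3221225472, same as memory.limit_in_bytes
```

Had the suffix meant powers of 1000, the limit would have been
3000000000 bytes instead.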

/Peter

> Only...we have 191668MiB of system memory, not MB. `free -m --si` (use
> powers of 1000, not 1024) reports 192628MB of system memory (which,
> frustratingly, indicates that the 'free' documentation is also not
> using the newer unit names). So it seems as though Slurm is working
> in powers of 1024, not powers of 1000.
> 
> I'm probably just confused about the unit definitions, or there is
> some convention being applied here, but would appreciate some
> confirmation either way!
> 
> Thanks.
> 
> Killian



