[slurm-users] ConstrainRAMSpace=yes and page cache?

Juergen Salk juergen.salk at uni-ulm.de
Thu Jun 13 21:38:10 UTC 2019


Dear all,

I'm just starting to get used to Slurm and am playing around with it in a small
test environment on our old cluster.

For our next system we will probably have to abandon our current exclusive node
access policy in favor of a shared policy, i.e. jobs from different users will
then run side by side on the same node at the same time. In order to prevent
jobs from interfering with each other, I have set both ConstrainCores=yes and
ConstrainRAMSpace=yes in cgroup.conf, which works as expected for limiting the
memory of the processes to the value requested at job submission (e.g. via the
--mem=... option).
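
For reference, this is roughly what the relevant part of the configuration
boils down to (a trimmed-down sketch; it assumes TaskPlugin=task/cgroup in
slurm.conf, which is required for cgroup.conf to take effect, and leaves all
other cgroup.conf parameters at their defaults):

--- cgroup.conf (sketch) ---
# Confine tasks to the cores allocated to the job
ConstrainCores=yes
# Enforce the memory requested at submission (--mem=...) via the memory cgroup
ConstrainRAMSpace=yes
--- snip ---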

However, I've noticed that ConstrainRAMSpace=yes also caps the page cache, for
which the Linux kernel would normally use any unused memory in a flexible way.
This may have a significant performance impact, as we have quite a number of
I/O-demanding applications (dominated by read operations) that are known to
benefit a lot from page caching.
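
As far as I understand, the reason is that the cgroup v1 memory controller
charges page cache created by a job against that job's memory limit, so clean
cache pages get reclaimed once the job approaches its limit. One way to watch
this from inside a job is to look at the job's own memory cgroup instead of the
node-wide /proc/meminfo. This is only a sketch: the path below assumes cgroup
v1 and Slurm's default cgroup hierarchy and may look different on other systems.

# Hypothetical check from within a job step (cgroup v1 assumed):
cg=/sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}
# memory limit enforced by ConstrainRAMSpace (bytes)
cat $cg/memory.limit_in_bytes
# page cache currently charged to this job (bytes)
awk '$1=="cache" {print $2}' $cg/memory.stat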

Here is a small example to illustrate the issue. The job writes a 16 GB file to
a local scratch file system, measures the amount of data cached in memory, and
then reads back the file it has just written.

$ cat job.slurm
#!/bin/bash
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

# Get amount of data cached in memory before writing the file
cached1=$(awk '$1=="Cached:" {print $2}' /proc/meminfo)

# Write 16 GB file to local scratch SSD
dd if=/dev/zero of=$SCRATCH/testfile count=16 bs=1024M

# Get amount of data cached in memory after writing the file
cached2=$(awk '$1=="Cached:" {print $2}' /proc/meminfo)

# Print difference of data cached in memory
echo -e "\nIncreased cached data by $(((cached2-cached1)/1000000)) GB\n"

# Read the file previously written
dd if=$SCRATCH/testfile of=/dev/null count=16 bs=1024M

$

For reference, this is the result *without* ConstrainRAMSpace=yes set in
cgroup.conf, with the job submitted as `sbatch --mem=2G --gres=scratch:16 job.slurm`:

--- snip ---
16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 10.9839 s, 1.6 GB/s

Increased cached data by 16 GB

16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 5.03225 s, 3.4 GB/s
--- snip ---

Note that 16 GB of data ends up cached and the read performance is 3.4 GB/s,
as the data is actually served from the page cache.

And this is the result *with* ConstrainRAMSpace=yes set in cgroup.conf, with
the job submitted by the very same command:

--- snip ---
16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 13.3163 s, 1.3 GB/s

Increased cached data by 1 GB

16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 11.1098 s, 1.5 GB/s
--- snip ---

Now only about 1 GB of data has been cached (roughly the 2 GB requested for
the job minus the ~1 GB allocated for dd's buffer), so the read performance
degrades to 1.5 GB/s (compared with 3.4 GB/s above).
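
The ~1 GB taken up by dd itself can be checked independently, e.g. by running
the write step under GNU time and looking at the reported maximum resident set
size, which should come out at roughly the bs=1024M buffer size:

/usr/bin/time -v dd if=/dev/zero of=$SCRATCH/testfile count=16 bs=1024M 2>&1 \
    | grep 'Maximum resident set size'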

Finally, this is the result *with* ConstrainRAMSpace=yes set in cgroup.conf
and the job submitted with `sbatch --mem=18G --gres=scratch:16 job.slurm`:

--- snip ---
16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 11.0601 s, 1.6 GB/s

Increased cached data by 16 GB

16+0 records in
16+0 records out
17179869184 bytes (17 GB) copied, 5.01643 s, 3.4 GB/s
--- snip ---

This is almost the same result as in the unconstrained case (i.e. without
ConstrainRAMSpace=yes set in cgroup.conf), because the amount of memory
requested for the job (18 GB) is large enough to allow the file to be fully
cached within the job's memory limit.

I do not think this is an issue with Slurm itself but rather with how cgroups
are supposed to work. However, I wonder how others cope with this.

Maybe we have to teach our users to also consider page cache when
requesting a certain amount of memory for their jobs?
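
In terms of the example above, that would mean requesting enough memory to hold
both the application's own footprint and the file data one would like to stay
cached, e.g.:

# ~2 GB for the job itself plus ~16 GB for the file to be kept in page cache
$ sbatch --mem=18G --gres=scratch:16 job.slurm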

Any comment or idea would be highly appreciated.

Thank you in advance.

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Phone: +49 (0)731 50-22478
Fax: +49 (0)731 50-22471


