[slurm-users] About memory limits with srun
Patrick Begou
Patrick.Begou at univ-grenoble-alpes.fr
Mon Mar 22 06:25:28 UTC 2021
Hi all,
I sent this mail from the wrong email address this weekend. I apologize if
it ends up being published twice (it has not appeared in the archive yet).
Maybe this is a basic question, but I'm stuck on it. I'm quite new to
managing a small cluster with Slurm instead of a local batch scheduler.
On the nodes I have set memory limits in slurm.conf:
DefMemPerCPU=2048
MaxMemPerCPU=4096
Requesting 1.5GB of RAM per CPU works:
srun --ntasks-per-node=1 --mem-per-cpu=1500M -p tenibre-gpu --pty bash -i
and my testcase can allocate up to about 1.5GB:
./a.out
allocating 1000MB.........Ok
....
allocating 1419MB.........Ok
allocating 1524MB.........Ok
Killed
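For context, the testcase simply allocates and touches ever larger blocks of
memory until it gets killed. A minimal sketch of that kind of program (my
actual a.out may differ in the details):

/* alloc_test.c - rough sketch of the testcase: allocate and touch
 * progressively larger blocks until the process hits the memory limit. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* grow the allocation in ~100 MB steps, starting at 1000 MB */
    for (size_t mb = 1000; ; mb += 105) {
        printf("allocating %zuMB.........", mb);
        fflush(stdout);
        char *p = malloc(mb * 1024 * 1024);
        if (!p) {
            printf("malloc failed\n");
            return 1;
        }
        /* touch every page so the memory is really charged to the task */
        memset(p, 1, mb * 1024 * 1024);
        printf("Ok\n");
        free(p);
    }
}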
Now I would like to use more memory than MaxMemPerCPU:
srun --ntasks-per-node=1 --mem-per-cpu=12G -p tenibre-gpu --pty bash -i
If I understand the documentation correctly, since mem-per-cpu > MaxMemPerCPU
the limit is applied to the task by aggregating CPUs along with the memory.
The squeue command does show 3 CPUs allocated on the node, which matches the
12GB requested (3 * MaxMemPerCPU = 3 * 4GB = 12GB), so everything seems correct:
JOBID  PARTITION    NAME  USER   ST  TIME  START_TIME           TIME_LIMIT  CPUS  NODELIST(REASON)
497    tenibre-gpu  bash  begou  R   1:23  2021-03-20T14:42:47  12:00:00    3     tenibre-gpu-0
But my task is still unable to exceed the MaxMemPerCPU value:
./a.out
allocating 1000MB.........Ok
....
allocating 4145MB.........Ok
allocating 4250MB.........Ok
Killed
So I must be wrong somewhere, but where?
Running the testcase in an ssh session (ssh as root, then su to a basic user)
allows it to use more memory, so the problem is related to my Slurm setup or
the way I use it.
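To compare the two environments, I also look at the memory limit the cgroup
actually imposes on the process. Below is a minimal sketch of the check I use;
it assumes the cgroup v1 memory controller is mounted under
/sys/fs/cgroup/memory, with the hierarchy path taken from /proc/self/cgroup:

/* cglimit.c - print the memory limit of the cgroup this process runs in.
 * Rough sketch, assuming cgroup v1 with the memory controller mounted
 * at /sys/fs/cgroup/memory (paths may differ on other setups). */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/self/cgroup", "r");
    if (!f) { perror("/proc/self/cgroup"); return 1; }

    char line[512];
    char limit_path[1024] = "";
    while (fgets(line, sizeof line, f)) {
        /* cgroup v1 lines look like "4:memory:/slurm/uid_.../job_.../step_..." */
        char *ctrl = strchr(line, ':');
        if (!ctrl) continue;
        ctrl++;
        char *path = strchr(ctrl, ':');
        if (!path) continue;
        *path = '\0';
        path++;
        path[strcspn(path, "\n")] = '\0';
        if (strcmp(ctrl, "memory") == 0) {
            snprintf(limit_path, sizeof limit_path,
                     "/sys/fs/cgroup/memory%s/memory.limit_in_bytes", path);
            break;
        }
    }
    fclose(f);

    if (!limit_path[0]) {
        fprintf(stderr, "no cgroup v1 memory controller found\n");
        return 1;
    }

    FILE *l = fopen(limit_path, "r");
    if (!l) { perror(limit_path); return 1; }
    unsigned long long bytes;
    if (fscanf(l, "%llu", &bytes) == 1)
        printf("%s: %llu bytes (%.1f MB)\n", limit_path, bytes, bytes / 1048576.0);
    fclose(l);
    return 0;
}

Compiled and run inside the srun session and inside the ssh session, it should
show whether the two environments really see different memory limits.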
Patrick