[slurm-users] Can't run jobs after upgrade to 17.11.5 due to memory?
Roberts, John E.
jeroberts at anl.gov
Mon Jun 11 14:50:43 MDT 2018
Hi,
I'm seeing this after an upgrade today, and I now can't get any jobs to run. Things were fine before the upgrade. Any ideas?
slurmstepd: error: Job 535721 exceeded memory limit (1160 > 1024), being killed
slurmstepd: error: Exceeded job memory limit
ulimit in a normal login shell shows:
$ ulimit -a | grep -i mem
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
but ulimit run through Slurm shows:
$ srun bash -c "ulimit -a" | grep -i mem
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) 1024
virtual memory (kbytes, -v) unlimited
This is CentOS 7, and the slurmd unit file sets:
$ grep -i mem /etc/systemd/system/multi-user.target.wants/slurmd.service
LimitMEMLOCK=infinity
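One more thing that could be checked is whether the 1024 figure comes from Slurm's own configuration rather than from the systemd unit. A minimal sketch, assuming scontrol is on the PATH of the node where it is run; the function name show_slurm_mem_config is only illustrative, and whether any of the settings it lists is actually the cause is an assumption, not something confirmed here:

```shell
# Hypothetical helper: list memory-related settings from the running Slurm config.
show_slurm_mem_config() {
    echo "Memory-related Slurm settings:"
    if command -v scontrol >/dev/null 2>&1; then
        # Same case-insensitive "mem" filter as used for the ulimit output above.
        scontrol show config | grep -i mem
    else
        # Fall back to a message so the helper still works off-cluster.
        echo "scontrol not found in PATH"
    fi
    return 0
}

show_slurm_mem_config
```

Comparing that output from before and after the upgrade (if an old copy of the config is still around) might show whether a default changed.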
Thanks!
--
John Roberts
HPC Systems Administrator