[slurm-users] Can't run jobs after upgrade to 17.11.5 due to memory?
Roberts, John E.
jeroberts at anl.gov
Mon Jun 11 15:12:44 MDT 2018
Nothing that looks incorrect to me:
DefMemPerNode = UNLIMITED
MaxMemPerNode = UNLIMITED
MemLimitEnforce = Yes
PropagateResourceLimitsExcept = MEMLOCK
The per-CPU variables (such as DefMemPerCPU) aren't set and never were.
On 6/11/18, 4:09 PM, "slurm-users on behalf of Renfro, Michael" <slurm-users-bounces at lists.schedmd.com on behalf of Renfro at tntech.edu> wrote:
Anything in particular set for DefMemPerCPU in your slurm.conf?
> On Jun 11, 2018, at 3:50 PM, Roberts, John E. <jeroberts at anl.gov> wrote:
> Seeing this after an upgrade today. I now can't get any jobs to run. Things were fine before the upgrade. Any ideas?
> slurmstepd: error: Job 535721 exceeded memory limit (1160 > 1024), being killed
> slurmstepd: error: Exceeded job memory limit
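
To compare the two numbers in the slurmstepd message across many killed jobs, the error lines can be pulled out of a log with a short script. This is only a sketch: the regular expression is written against the exact message quoted above, the function name parse_memory_kill is made up for illustration, and no assumption is made about the units of the two values.

```python
import re

# Pattern written against the slurmstepd message quoted in this thread.
# The wording may differ in other Slurm versions (assumption).
LIMIT_RE = re.compile(
    r"Job (?P<job_id>\d+) exceeded memory limit "
    r"\((?P<used>\d+) > (?P<limit>\d+)\)"
)


def parse_memory_kill(line):
    """Return (job_id, used, limit) as ints, or None if the line does not match.

    Hypothetical helper for illustration; units are left uninterpreted.
    """
    m = LIMIT_RE.search(line)
    if m is None:
        return None
    return int(m.group("job_id")), int(m.group("used")), int(m.group("limit"))


line = ("slurmstepd: error: Job 535721 exceeded memory limit "
        "(1160 > 1024), being killed")
print(parse_memory_kill(line))  # (535721, 1160, 1024)
```

Lines that do not carry the job ID and the two values, such as the second error line above, return None and can be skipped.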