[slurm-users] Can't run jobs after upgrade to 17.11.5 due to memory?

Roberts, John E. jeroberts at anl.gov
Mon Jun 11 15:12:44 MDT 2018


Nothing that looks incorrect to me:

DefMemPerNode           = UNLIMITED
MaxMemPerNode           = UNLIMITED
MemLimitEnforce         = Yes
PropagateResourceLimitsExcept = MEMLOCK

The per-CPU variables (DefMemPerCPU and the like) aren't set and never were.
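
For anyone comparing their own settings against the error, here is a rough Python sketch that splits the config lines above into key/value pairs and pulls the job ID and the two numbers out of the slurmstepd message quoted below. The function names parse_config and parse_memory_error are made up for illustration, and the sample text is copied from this thread rather than read from a live cluster; on a real system the input would presumably come from `scontrol show config` and the job's output file.

```python
import re

# Sample lines copied from the config excerpt in this message.
CONFIG_TEXT = """\
DefMemPerNode           = UNLIMITED
MaxMemPerNode           = UNLIMITED
MemLimitEnforce         = Yes
PropagateResourceLimitsExcept = MEMLOCK
"""

# Error line copied from the original report quoted further down.
ERROR_LINE = (
    "slurmstepd: error: Job 535721 exceeded memory limit "
    "(1160 > 1024), being killed"
)


def parse_config(text):
    """Turn 'Key = Value' lines into a dict, skipping lines without '='."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def parse_memory_error(line):
    """Extract job ID, reported usage, and limit from the error line.

    Returns None when the line does not match the expected wording.
    The numbers are returned as-is; no unit is assumed here.
    """
    match = re.search(r"Job (\d+) exceeded memory limit \((\d+) > (\d+)\)", line)
    if match is None:
        return None
    job_id, used, limit = (int(group) for group in match.groups())
    return {"job_id": job_id, "used": used, "limit": limit}


if __name__ == "__main__":
    settings = parse_config(CONFIG_TEXT)
    error = parse_memory_error(ERROR_LINE)
    # List every memory-related setting next to the limit that was enforced.
    for key, value in settings.items():
        if "Mem" in key:
            print(f"{key} = {value}")
    print(f"Job {error['job_id']}: used {error['used']}, limit {error['limit']}")
```

The point of the comparison: the limit of 1024 in the error does not match any of the UNLIMITED per-node values shown above, so it must be coming from somewhere other than these settings.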

Thanks!
John 

On 6/11/18, 4:09 PM, "slurm-users on behalf of Renfro, Michael" <slurm-users-bounces at lists.schedmd.com on behalf of Renfro at tntech.edu> wrote:

    Anything in particular set for DefMemPerCPU in your slurm.conf?
    
    > On Jun 11, 2018, at 3:50 PM, Roberts, John E. <jeroberts at anl.gov> wrote:
    > 
    > Hi,
    > 
    >    Seeing this after an upgrade today. I now can't get any jobs to run. Things were fine before the upgrade. Any ideas?
    > 
    >    slurmstepd: error: Job 535721 exceeded memory limit (1160 > 1024), being killed
    >    slurmstepd: error: Exceeded job memory limit
    
    
    


