<div dir="ltr"><div><div><div><div><div>Hi fellow Slurm users,<br></div>We have been struggling for a while to understand how MaxRSS is reported.<br><br>This is because jobs often die even though their reported MaxRSS is sometimes less than 10% of the requested memory.<br><br></div>I just found the following document:<br><a href="https://research.csc.fi/-/a">https://research.csc.fi/-/a</a><br><br></div>It says:<br>"<strong>maxrss </strong>= maximum amount of memory used at any time by
any process in that job. This applies directly for serial jobs. For
parallel jobs you need to multiply with the number of cores (max 16 or
24 as this is reported only for that node that used the most memory)"<br><br></div>While 'man sacct' says:<br>"Maximum resident set size of all tasks in job."<br><br></div>Which explanation is correct? How should I be interpreting MaxRSS?<br><div><div><div><div><div><div><br></div><div>Thanks,<br></div><div>Eli<br></div></div></div></div></div></div></div>