[slurm-users] Way MaxRSS should be interpreted

E.S. Rosenberg esr+slurm-dev at mail.hebrew.edu
Tue Apr 17 06:41:32 MDT 2018


Hi Loris,
Thanks for your explanation!
I would have interpreted it as max(sum()).

Is there a way to get max(sum()), or at least some form of sum()? The
assumption that all processes peak at the same value is not valid
unless all threads have essentially the same workload...
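
To make the distinction concrete, here is a minimal sketch with made-up
numbers. The sample values and function names are hypothetical and are not
anything Slurm provides; the sketch only shows how the three readings differ
when tasks peak at different times:

```python
# Hypothetical RSS samples in MB, one list per task, one value per
# sampling interval. These numbers are invented for illustration only.
samples = {
    "task0": [100, 400, 150],
    "task1": [300, 120, 500],
    "task2": [200, 250, 100],
}


def max_of_individual(samples):
    """Largest RSS of any single task at any time (how MaxRSS is read above)."""
    return max(max(series) for series in samples.values())


def sum_of_maxima(samples):
    """Sum of each task's own peak, regardless of when the peak happened."""
    return sum(max(series) for series in samples.values())


def max_of_sum(samples):
    """Peak of the total RSS over time, i.e. max(sum())."""
    return max(sum(step) for step in zip(*samples.values()))


if __name__ == "__main__":
    ntasks = len(samples)
    print("max of individual tasks:", max_of_individual(samples))
    print("MaxRSS * ntasks        :", max_of_individual(samples) * ntasks)
    print("sum of per-task maxima :", sum_of_maxima(samples))
    print("max(sum())             :", max_of_sum(samples))
```

With these numbers the largest single task reaches 500 MB, multiplying by the
three tasks gives 1500 MB, the sum of the per-task maxima is 1150 MB, and
max(sum()) is only 770 MB, so multiplying by the task count overestimates
whenever the workloads are uneven.
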
Thanks again!
Eli

On Tue, Apr 17, 2018 at 2:09 PM, Loris Bennett <loris.bennett at fu-berlin.de>
wrote:

> Hi Eli,
>
> "E.S. Rosenberg" <esr+slurm-dev at mail.hebrew.edu> writes:
>
> > Hi fellow slurm users,
> > We have been struggling for a while with understanding how MaxRSS is
> > reported.
> >
> > This is because jobs often die even though MaxRSS sometimes does not
> > approach even 10% of the requested memory.
> >
> > I just found the following document:
> > https://research.csc.fi/-/a
> >
> > It says:
> > "maxrss = maximum amount of memory used at any time by any process in
> > that job. This applies directly for serial jobs. For parallel jobs you need
> > to multiply with the number of cores (max 16 or 24 as this is
> > reported only for that node that used the most memory)"
> >
> > While 'man sacct' says:
> > "Maximum resident set size of all tasks in job."
> >
> > Which explanation is correct? How should I be interpreting MaxRSS?
>
> As far as I can tell, both explanations are correct, but the
> text in 'man sacct' is confusing.
>
>   "Maximum resident set size of all tasks in job."
>
> is analogous to
>
>   "maximum height of all people in the room"
>
> rather than
>
>   "total height of all people in the room"
>
> More specifically it means
>
>   "Maximum individual resident set size out of the group of resident set
>   sizes associated with all tasks in job."
>
> It doesn't mean
>
>   "Sum of the resident set sizes of all the tasks"
>
> I'm a native English-speaker and I keep on stumbling over this in 'man
> sacct' and then remembering that I have already worked out how it was
> supposed to be interpreted.
>
> My suggestion for improving this would be
>
>   "Maximum individual resident set size of all resident set sizes
>   associated with the tasks in job."
>
> It's a little clunky, but I hope it is clearer.
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de
>
>