[slurm-users] Memory oversubscription and scheduling
cory.holcomb at broadcom.com
Mon May 7 07:58:38 MDT 2018
Thank you for the reply; I was beginning to wonder if my message had been seen.
While I understand how batch systems work, consider a system daemon that
develops a memory leak and consumes memory outside of any job's allocation.
Not checking the memory actually in use on the box before dispatch seems like
a good way to black-hole a bunch of jobs.
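For what it's worth, Slurm has knobs aimed at exactly this failure mode: treating memory as a schedulable resource, confining jobs with cgroups, and reserving a slice of each node's RAM for system daemons via `MemSpecLimit`. The fragment below is only a sketch; the option names are standard slurm.conf/cgroup.conf parameters, but the node names and values are illustrative and would need adapting to a real cluster.

```
# slurm.conf (sketch -- illustrative values)
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory     # schedule memory as well as cores
TaskPlugin=task/cgroup                  # confine each job with cgroups
# Reserve 8 GB on each node for the OS and system daemons (hypothetical nodes):
NodeName=node[01-16] RealMemory=128000 MemSpecLimit=8192

# cgroup.conf (sketch)
ConstrainRAMSpace=yes                   # cap each job at its requested memory
```

With memory reserved for system use, a leaking daemon eats into the reserved slice before it can starve correctly-sized jobs, which addresses the scenario above without per-dispatch free-memory checks.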
On Sat, May 5, 2018 at 7:21 AM, Chris Samuel <chris at csamuel.org> wrote:
> On Thursday, 26 April 2018 3:28:19 AM AEST Cory Holcomb wrote:
> > It appears that I have a configuration that only takes into account the
> > allocated memory before dispatching.
> With batch systems the idea is for the users to set constraints for their
> jobs so the scheduler can backfill other jobs onto nodes knowing how much
> of each resource they can rely on.
> So really the emphasis is on making the users set good resource requests
> (such as memory) and for the batch system to terminate jobs that exceed them (or
> arrange for the kernel to constrain them to that amount via cgroups).
> Hope that helps!
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
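The "good resource requests" Chris describes boil down to per-job directives like the ones below. This is a minimal sketch using standard sbatch options; the job name and binary are hypothetical placeholders.

```
#!/bin/bash
#SBATCH --job-name=example      # hypothetical job name
#SBATCH --mem=4G                # request what the job actually needs
#SBATCH --time=01:00:00         # wall-clock limit
#SBATCH --cpus-per-task=2
srun ./my_program               # hypothetical binary
```

With cgroup enforcement enabled, a job exceeding its `--mem` request is constrained or killed rather than dragging the whole node down.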