[slurm-users] Slurm 17.11 and configuring backfill and oversubscribe to allow concurrent processes

Robert Kudyba rkudyba at fordham.edu
Thu Feb 27 19:23:06 UTC 2020


>
> If that 32 GB is main system RAM, and not GPU RAM, then yes. Since our GPU
> nodes are over-provisioned in terms of both RAM and CPU, we end up using
> the excess resources for non-GPU jobs.
>

No, it's GPU RAM.


> If that 32 GB is GPU RAM, then I have no experience with that, but I
> suspect MPS would be required.


OK, so does Slurm support MPS, and if so, as of what version? Would we need to
enable cons_tres and use, e.g., --mem-per-gpu?
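For reference, a minimal sketch of what enabling MPS sharing might look like, assuming Slurm 19.05 or later (the release that introduced cons_tres and the gres/mps plugin); the device path and share count below are illustrative, not taken from this cluster. Note that --mem-per-gpu governs host RAM allocated per GPU, not GPU memory; partitioning the GPU's own memory among jobs is what MPS shares approximate.

```conf
# slurm.conf (sketch): switch to the cons_tres plugin and declare MPS as a GRES
SelectType=select/cons_tres
GresTypes=gpu,mps

# gres.conf on the GPU node (sketch): expose the V100 as both gpu and mps.
# Count=100 divides the GPU into 100 shares that jobs can request portions of.
Name=gpu File=/dev/nvidia0
Name=mps Count=100
```

A job would then request a fraction of the GPU with something like `sbatch --gres=mps:25 job.sh`, leaving the remaining shares available to concurrent jobs.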


On Thu, Feb 27, 2020 at 12:46 PM Renfro, Michael <Renfro at tntech.edu> wrote:

> If that 32 GB is main system RAM, and not GPU RAM, then yes. Since our GPU
> nodes are over-provisioned in terms of both RAM and CPU, we end up using
> the excess resources for non-GPU jobs.
>
> If that 32 GB is GPU RAM, then I have no experience with that, but I
> suspect MPS would be required.
>
> > On Feb 27, 2020, at 11:14 AM, Robert Kudyba <rkudyba at fordham.edu> wrote:
> >
> > So looking at the new cons_tres option at
> > https://slurm.schedmd.com/SLUG19/GPU_Scheduling_and_Cons_Tres.pdf ,
> > would we be able to use, e.g., --mem-per-gpu= (memory per allocated GPU)?
> > If a user allocated --mem-per-gpu=8, and the V100 we have is 32 GB,
> > will subsequent jobs be able to use the remaining 24 GB?
>
>
>