[slurm-users] Over-subscription for a GRES type

Paul Browne pfb29 at cam.ac.uk
Fri Nov 23 11:43:49 MST 2018


Ah, of course, that makes sense, thanks. I guess if we're constraining the
devices into job-specific cgroups, then the slurmd on the node may know which
device is assigned to which job and be able to interrogate resource usage
from that, but there's no mechanism to do anything beyond that.
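
For reference, a minimal sketch of the kind of setup the paragraph above
describes, where GPU devices are constrained into per-job cgroups. The
option names are standard Slurm ones; the node name, GPU count and device
paths are hypothetical placeholders, not taken from this thread.

```
# slurm.conf (fragment; node name and GPU count are hypothetical)
GresTypes=gpu
TaskPlugin=task/cgroup
NodeName=gpu-node01 Gres=gpu:4

# gres.conf on the node (device paths are hypothetical)
NodeName=gpu-node01 Name=gpu File=/dev/nvidia[0-3]

# cgroup.conf: restrict each job to the GRES devices it was allocated
ConstrainDevices=yes
```

This only controls which device files a job can open; it does not
arbitrate memory or SMs on the card itself.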

On Fri, 23 Nov 2018, 6:36 pm Mark Hahn <hahn at mcmaster.ca> wrote:

> > We have a use-case in that the GRES being tracked on a particular partition
> > are GPU cards, but aren't being used by applications that would require them
> > exclusively (lightweight direct rendering rather than GP-GPU/CUDA
>
> the issue is that slurm/kernel can't arbitrate resources on the GPU,
> so oversubscription is likely to run out of device memory or SMs, no?
>
>