<div dir="ltr"><div dir="ltr">This doesn't directly answer your question, but in February last year there was a discussion on this mailing list about limiting user resources on login nodes ("Stopping compute usage on login nodes"). Some of the suggestions involved using cgroups, and it's possible that those methods could be extended to limit access to GPUs, so it might be worth looking into.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, 18 May 2019 at 00:28, Dave Evans &lt;<a href="mailto:rdevans@ece.ubc.ca">rdevans@ece.ubc.ca</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><br></div><div>We are using a single-system "cluster" and want some control over fair use of the GPUs. The users are not supposed to be able to use the GPUs until they have allocated the resources through Slurm. We have no head node, so slurmctld, slurmdbd, and slurmd all run on the same system. <br></div><div><br></div><div>I have a configuration working now such that the GPUs can be scheduled and allocated.</div><div>However, logging into the system before allocating GPUs gives full access to all of them. 
<br></div><div><br></div><div>I would like to configure slurm cgroups to disable access to GPUs until they have been allocated.</div><div><br></div><div>On first login, I get:</div><div>nvidia-smi -q | grep UUID<br> GPU UUID : GPU-6076ce0a-bc03-a53c-6616-0fc727801c27<br> GPU UUID : GPU-5620ec48-7d76-0398-9cc1-f1fa661274f3<br> GPU UUID : GPU-176d0514-0cf0-df71-e298-72d15f6dcd7f<br> GPU UUID : GPU-af03c80f-6834-cb8c-3133-2f645975f330<br> GPU UUID : GPU-ef10d039-a432-1ac1-84cf-3bb79561c0d3<br> GPU UUID : GPU-38168510-c356-33c9-7189-4e74b5a1d333<br> GPU UUID : GPU-3428f78d-ae91-9a74-bcd6-8e301c108156<br> GPU UUID : GPU-c0a831c0-78d6-44ec-30dd-9ef5874059a5</div><div><br></div><div><br></div><div>And running from the queue:</div><div>srun -N 1 --gres=gpu:2 nvidia-smi -q | grep UUID<br> GPU UUID : GPU-6076ce0a-bc03-a53c-6616-0fc727801c27<br> GPU UUID : GPU-5620ec48-7d76-0398-9cc1-f1fa661274f3<br></div><div><br></div><div><br></div><div>Pastes of my config files are:</div><div>## slurm.conf ##<br></div><div><a href="https://pastebin.com/UxP67cA8" target="_blank">https://pastebin.com/UxP67cA8</a></div><div><br><div><b>## cgroup.conf ##<br></b></div><div>CgroupAutomount=yes <br>CgroupReleaseAgentDir="/etc/slurm/cgroup" <br><br>ConstrainCores=yes <br>ConstrainDevices=yes<br>ConstrainRAMSpace=yes<br>#TaskAffinity=yes<br></div><div><br></div></div><div><b>## cgroup_allowed_devices_file.conf ## </b><br></div><div>/dev/null<br>/dev/urandom<br>/dev/zero<br>/dev/sda*<br>/dev/cpu/*/*<br>/dev/pts/*<br>/dev/nvidia*<br></div></div></div></div></div></div></div></div></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div style="color:rgb(34,34,34);background-color:rgb(255,255,255)"><b style="font-family:Arial;font-size:small">Nathan<span style="color:rgb(81,80,88)"> </span>Harper</b><span style="font-family:Arial;font-size:small;color:rgb(0,174,255)"> // </span><font size="2" style="font-family:Arial">IT Systems Lead</font></div><div style="color:rgb(34,34,34);background-color:rgb(255,255,255)"><font face="Arial" size="1"><span style="color:rgb(81,80,88)">CFMS Services Ltd</span></font></div></div>
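<div><br></div><div>As a starting point for the cgroup idea mentioned in the reply, here is a minimal sketch of how the device rules for the GPUs could be derived. It assumes the cgroup v1 devices controller (the one Slurm's ConstrainDevices relies on), GNU stat, and that the GPUs appear as character devices under /dev/nvidia*. The helper name device_rule is hypothetical. The script only prints rules in the "c MAJOR:MINOR rwm" format; writing them to the devices.deny file of whichever cgroup holds login sessions is left out, because that path depends on the site's setup.</div>

```shell
#!/bin/bash
# Sketch only: print cgroup v1 devices-controller rules for NVIDIA device nodes.
# Assumptions (not from the original thread): cgroup v1, GNU stat, /dev/nvidia* nodes.

# device_rule DEV: print "c MAJOR:MINOR rwm" for a character device node.
# Returns non-zero if DEV is not a character device.
device_rule() {
    local dev=$1 major minor
    [ -c "$dev" ] || return 1
    # GNU stat prints the device major (%t) and minor (%T) numbers in hex.
    major=$(( 0x$(stat -c '%t' "$dev") ))
    minor=$(( 0x$(stat -c '%T' "$dev") ))
    printf 'c %d:%d rwm\n' "$major" "$minor"
}

# Print one rule per GPU-related device node; nothing is written to any cgroup.
for dev in /dev/nvidia*; do
    if rule=$(device_rule "$dev"); then
        printf '%s\t%s\n' "$dev" "$rule"
    fi
done
```

<div>On a host without NVIDIA device nodes the loop prints nothing. Comparing the printed rules with the devices.list file of a login session's cgroup would show whether that session is currently allowed to open the GPUs.</div>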