<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Here's how we handle this at our site: <br>
</p>
<p><br>
</p>
<p>Create a separate partition named debug that also contains that
node. Give the debug partition a very short time limit, say 30 to 60
minutes: long enough for debugging, but too short to do any real
work. Make the priority of the debug partition much higher than
that of the regular partition. With that setup, debug users may not
get a GPU right away, but their jobs should go to the head of the
queue, so as soon as a GPU becomes available, a debug job will get it. <br>
</p>
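<p>As a rough sketch, the two partitions described above might look
something like this in slurm.conf. The node name gpunode01, the
regular partition name gpu, the 7-day limit, and the PriorityTier
values are illustrative assumptions, not our actual configuration.
PriorityTier is only one way to express "higher priority";
PriorityJobFactor is another option if you use the multifactor
priority plugin. <br>
</p>
<pre>
# Hypothetical node and partition names; adjust to your site.
# Regular partition: long time limit, lower scheduling tier.
PartitionName=gpu   Nodes=gpunode01 Default=YES MaxTime=7-00:00:00 PriorityTier=1  State=UP

# Debug partition on the same node: short time limit, higher tier,
# so pending debug jobs are considered before jobs in the gpu partition.
PartitionName=debug Nodes=gpunode01 Default=NO  MaxTime=01:00:00   PriorityTier=10 State=UP
</pre>
<p>A user would then submit a short debug job with something like
the following (job.sh is a placeholder script name): <br>
</p>
<pre>
sbatch --partition=debug --gres=gpu:1 --time=00:30:00 job.sh
</pre>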
<p><br>
</p>
<p>--<br>
Prentice<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 4/24/19 11:06 AM, Mike Cammilleri
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DM6PR06MB3882F25A39239B0A6983BF1BFB3C0@DM6PR06MB3882.namprd06.prod.outlook.com">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
Hi everyone,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
We have a single node with 8 GPUs. Users often queue up many
pending jobs and use all 8 at the same time, so a user who just
wants to run a short debug job on one GPU has to wait too long
for one to free up. Is there a way with gres.conf or a QOS to
limit the number of GPUs in use concurrently for all users? Most
submissions are single jobs that request a GPU with --gres=gpu:1,
but users submit many of them (not as an array). Our gres.conf
looks like the following:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<span>Name=gpu File=/dev/nvidia0 #CPUs=0,1,2,3<br>
</span>
<div>Name=gpu File=/dev/nvidia1 #CPUs=4,5,6,7<br>
</div>
<div>Name=gpu File=/dev/nvidia2 #CPUs=8,9,10,11<br>
</div>
<div>Name=gpu File=/dev/nvidia3 #CPUs=12,13,14,15<br>
</div>
<div>Name=gpu File=/dev/nvidia4 #CPUs=16,17,18,19<br>
</div>
<div>Name=gpu File=/dev/nvidia5 #CPUs=20,21,22,23<br>
</div>
<div>Name=gpu File=/dev/nvidia6 #CPUs=24,25,26,27<br>
</div>
<div>Name=gpu File=/dev/nvidia7 #CPUs=28,29,30,31<br>
</div>
<span></span><br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
I thought of insisting that they submit the jobs as an array and
limit with %7, but maybe there's a more elegant solution using
the config.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
Any tips appreciated.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div id="Signature">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size:12pt;
color:#000000; font-family:Calibri,Helvetica,sans-serif">
<p style="margin-top: 0px; margin-bottom: 0px;margin-top:0;
margin-bottom:0">Mike Cammilleri</p>
<p style="margin-top: 0px; margin-bottom: 0px;margin-top:0;
margin-bottom:0">Systems Administrator</p>
<p style="margin-top: 0px; margin-bottom: 0px;margin-top:0;
margin-bottom:0">Department of Statistics | <span
style="font-size:12pt">UW-Madison</span></p>
<p style="margin-top: 0px; margin-bottom: 0px;margin-top:0;
margin-bottom:0"><span style="font-size:12pt">1300
University Ave | Room 1280<br>
608-263-6673 | <a class="moz-txt-link-abbreviated" href="mailto:mikec@stat.wisc.edu">mikec@stat.wisc.edu</a></span></p>
</div>
</div>
</blockquote>
</body>
</html>