<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
The question is not so much whether a job can access a GPU at all, but which GPUs are listed in $CUDA_VISIBLE_DEVICES. That variable controls which devices our CUDA jobs can see and therefore use (within any cgroups constraints, of course). In my case, Slurm
sometimes sets $CUDA_VISIBLE_DEVICES to a GPU that is not in the Slurm configuration, because that GPU is intended only for driving the display and not for GPU computations.</div>
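<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
To illustrate, here is a rough sketch of how this could be checked from inside a job. It assumes the entries in $CUDA_VISIBLE_DEVICES are plain device indices that match the numbering in gres.conf, which may not hold if devices are renumbered inside a cgroup or listed by UUID. The set CONFIGURED_GPUS and both function names are made up for the example and are not part of Slurm or CUDA.</div>
<pre>
```python
import os

# Hypothetical: indices of the GPUs that are listed in gres.conf on this node.
CONFIGURED_GPUS = {"0", "1"}


def visible_devices(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES as a list of strings.

    An unset or empty variable yields an empty list. Note that an unset
    variable normally means CUDA can see every device on the node.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [d.strip() for d in raw.split(",") if d.strip()]


def unconfigured(devices, configured=CONFIGURED_GPUS):
    """Return the visible devices that are not in the configured set."""
    return [d for d in devices if d not in configured]


if __name__ == "__main__":
    devs = visible_devices()
    print("CUDA_VISIBLE_DEVICES:", devs)
    print("not in Slurm config:", unconfigured(devs))
```
</pre>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Running something like this at the start of a job would show whether the display-only GPU has ended up in the list.</div>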
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Thanks for your thoughts!</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Steve<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Christopher Samuel &lt;chris@csamuel.org&gt;<br>
<b>Sent:</b> Friday, July 14, 2023 1:57 PM<br>
<b>To:</b> slurm-users@lists.schedmd.com &lt;slurm-users@lists.schedmd.com&gt;<br>
<b>Subject:</b> Re: [slurm-users] Unconfigured GPUs being allocated</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">
On 7/14/23 10:20 am, Wilson, Steven M wrote:<br>
<br>
> I upgraded Slurm to 23.02.3 but I'm still running into the same problem.<br>
> Unconfigured GPUs (those absent from gres.conf and slurm.conf) are still<br>
> being made available to jobs so we end up with compute jobs being run on<br>
> GPUs which should only be used<br>
<br>
I think this is expected - it's not that Slurm is making them available,<br>
it's that it's unaware of them and so doesn't control them in the way it<br>
does for the GPUs it does know about. So you get the default behaviour<br>
(any process can access them).<br>
<br>
If you want to stop them being accessed from Slurm you'd need to find a<br>
way to prevent that access via cgroups games or similar.<br>
<br>
All the best,<br>
Chris<br>
--<br>
Chris Samuel : <a href="http://www.csamuel.org/">http://www.csamuel.org/</a>
: Berkeley, CA, USA<br>
<br>
<br>
</div>
</span></font></div>
</body>
</html>