<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
That would certainly do it. If you look at the slurmctld log when it comes up, it will say that it's marking that node as invalid because it has fewer (0) gres resources than you say it should have. That's because slurmd on that node will come up and say "What
gres resources??"</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
For testing purposes, you can just create a dummy file on the node, then in gres.conf, point to that file with the File= parameter as if it were the GPU device file. As long as you don't try to actually use it as a GPU device, that should be enough for that node to think it has
gres/gpu resources. That's what I do in my vagrant slurm cluster.</div>
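<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
A minimal sketch of that setup, assuming the node is called "test" as in the original question and using /tmp/fake_gpu0 as a purely hypothetical path for the dummy file (create it on the node first, for example with "touch /tmp/fake_gpu0"):</div>
<pre>
# slurm.conf (lines from the original question)
GresTypes=gpu
NodeName=test Gres=gpu:1

# gres.conf on node "test"
# /tmp/fake_gpu0 is a made-up placeholder, not a real GPU device
NodeName=test Name=gpu File=/tmp/fake_gpu0
</pre>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
The path and file name are illustrative assumptions only. After a change like this, slurmd on the node and slurmctld will probably need a restart, and the node may have to be resumed (for example with "scontrol update NodeName=test State=RESUME") before it leaves the invalid state.</div>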
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Rob</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt; on behalf of Xaver Stiensmeier &lt;xaverstiensmeier@gmx.de&gt;<br>
<b>Sent:</b> Monday, July 17, 2023 9:43 AM<br>
<b>To:</b> slurm-users@lists.schedmd.com &lt;slurm-users@lists.schedmd.com&gt;<br>
<b>Subject:</b> Re: [slurm-users] GRES and GPUs</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">Hi Hermann,<br>
<br>
Good idea, but we are already using `SelectType=select/cons_tres`. After<br>
setting everything up again (in case I made an unnoticed mistake), I saw<br>
that the node got marked STATE=inval.<br>
<br>
To be honest, I thought I could just claim that a node has a GPU even if<br>
it doesn't have one - just for testing purposes. Could this be the issue?<br>
<br>
Best regards,<br>
Xaver Stiensmeier<br>
<br>
On 17.07.23 14:11, Hermann Schwärzler wrote:<br>
> Hi Xaver,<br>
><br>
> what kind of SelectType are you using in your slurm.conf?<br>
><br>
> Per <a href="https://slurm.schedmd.com/gres.html">https://slurm.schedmd.com/gres.html</a>
you have to consider:<br>
> "As for the --gpu* option, these options are only supported by Slurm's<br>
> select/cons_tres plugin."<br>
><br>
> So you can use "--gpus ..." only when you state<br>
> SelectType = select/cons_tres<br>
> in your slurm.conf.<br>
><br>
> But "--gres=gpu:1" should always work.<br>
><br>
> Regards<br>
> Hermann<br>
><br>
><br>
> On 7/17/23 13:43, Xaver Stiensmeier wrote:<br>
>> Hey,<br>
>><br>
>> I am currently trying to understand how I can schedule a job that<br>
>> needs a GPU.<br>
>><br>
>> I read about GRES <a href="https://slurm.schedmd.com/gres.html">https://slurm.schedmd.com/gres.html</a>
and tried to use:<br>
>><br>
>> GresTypes=gpu<br>
>> NodeName=test Gres=gpu:1<br>
>><br>
>> But calling - after a 'sudo scontrol reconfigure':<br>
>><br>
>> srun --gpus 1 hostname<br>
>><br>
>> didn't work:<br>
>><br>
>> srun: error: Unable to allocate resources: Invalid generic resource<br>
>> (gres) specification<br>
>><br>
>> so I read more <a href="https://slurm.schedmd.com/gres.conf.html">https://slurm.schedmd.com/gres.conf.html</a>
but that<br>
>> didn't really help me.<br>
>><br>
>><br>
>> I am rather confused. GRES claims to be generic resources but then it<br>
>> comes with three defined resources (GPU, MPS, MIG) and using one of<br>
>> those didn't work in my case.<br>
>><br>
>> Obviously, I am misunderstanding something, but I am unsure where to<br>
>> look.<br>
>><br>
>><br>
>> Best regards,<br>
>> Xaver Stiensmeier<br>
>><br>
><br>
<br>
</div>
</span></font></div>
</body>
</html>