<div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Loris, <br></div><div><br></div><div>You are correct! Instead of using nvidia-smi as a check, I confirmed the GPU allocation by printing out <br></div><div>the environment variable CUDA_VISIBLE_DEVICES, and it was as expected. <br></div><div><br></div><div>Thanks for your help! <br></div><div><br></div></div></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 14, 2021 at 12:18 AM Loris Bennett <<a href="mailto:loris.bennett@fu-berlin.de">loris.bennett@fu-berlin.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Abhiram,<br>

<br>

Abhiram Chintangal <<a href="mailto:achintangal@berkeley.edu" target="_blank">achintangal@berkeley.edu</a>> writes:<br>

<br>

> Hello, <br>

><br>

> I recently set up a small cluster at work using Warewulf/Slurm. Currently, I am not able to get the scheduler to <br>

> work well with GPUs (GRES). <br>

><br>

> While Slurm is able to filter by GPU type, it allocates all the GPUs on the node. See below:<br>

><br>

>  [abhiram@whale ~]$ srun --gres=gpu:p100:2 -n 1 --partition=gpu nvidia-smi --query-gpu=index,name --format=csv<br>

>  index, name<br>

>  0, Tesla P100-PCIE-16GB<br>

>  1, Tesla P100-PCIE-16GB<br>

>  2, Tesla P100-PCIE-16GB<br>

>  3, Tesla P100-PCIE-16GB<br>

>  [abhiram@whale ~]$ srun --gres=gpu:titanrtx:2 -n 1 --partition=gpu nvidia-smi --query-gpu=index,name --format=csv<br>

>  index, name<br>

>  0, TITAN RTX<br>

>  1, TITAN RTX<br>

>  2, TITAN RTX<br>

>  3, TITAN RTX<br>

>  4, TITAN RTX<br>

>  5, TITAN RTX<br>

>  6, TITAN RTX<br>

>  7, TITAN RTX<br>

><br>

> I am fairly new to Slurm and still figuring out my way around it. I would really appreciate any help with this.<br>

><br>

> For your reference, I attached the slurm.conf and gres.conf files. <br>

<br>

I think this is expected, since nvidia-smi does not actually use the<br>

GPUs, but just returns information on their usage.<br>

<br>

A better test would be to run a simple job which really does run on,<br>

say, two GPUs and then, while the job is running, log into the GPU node<br>

and run <br>

<br>

  nvidia-smi --query-gpu=index,name,utilization.gpu --format=csv<br>

<br>

Cheers,<br>

<br>

Loris<br>

<br>

-- <br>

Dr. Loris Bennett (Hr./Mr.)<br>

ZEDAT, Freie Universität Berlin         Email <a href="mailto:loris.bennett@fu-berlin.de" target="_blank">loris.bennett@fu-berlin.de</a><br>

<br>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><pre cols="72">Abhiram Chintangal

QB3 Nogales Lab 

Bioinformatics Specialist @ Howard Hughes Medical Institute

University of California Berkeley 

708D Stanley Hall, Berkeley, CA 94720

Phone (510)666-3344<br></pre>

</div></div></div></div></div></div></div>
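<div>For anyone finding this thread later, the CUDA_VISIBLE_DEVICES check described above can be sketched as a small script. This is only a minimal sketch: the script name check_gpus.sh and the helper count_visible_gpus are hypothetical, and it assumes Slurm exports CUDA_VISIBLE_DEVICES as a comma-separated list for the job step.</div>

```shell
#!/bin/sh
# check_gpus.sh (hypothetical name): report which GPUs the job step was given.
# Assumption: Slurm sets CUDA_VISIBLE_DEVICES to a comma-separated list
# such as "0,1" when the job requests GPUs via --gres.

# Count the entries in a comma-separated device list passed as $1.
count_visible_gpus() {
    if [ -z "$1" ]; then
        # Unset or empty variable: no GPUs were exposed to this step.
        echo 0
        return
    fi
    # One entry per line, then count the lines (strip padding from wc).
    printf '%s\n' "$1" | tr ',' '\n' | wc -l | tr -d ' '
}

echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
echo "visible GPU count: $(count_visible_gpus "${CUDA_VISIBLE_DEVICES:-}")"
```

<div>Run with the same options as in the original question, e.g. srun --gres=gpu:p100:2 -n 1 --partition=gpu ./check_gpus.sh, the reported count should match the number of GPUs requested, even if nvidia-smi inside the job still lists every GPU on the node.</div>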