<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=ISO-8859-2">

  </head>

  <body>

    <p>Hello, I'm reviving a rather old thread, but I just noticed that
      my January 2021 message is missing from the archives, so I'm sending
      it again now that the issue has become relevant again on our side.</p>

    <p><br>

    </p>

    <p>To quickly recap: we want to grant permissions not only to the
      /dev/nvidia* devices based on the requested gres, but also to the
      corresponding /dev/dri/card* and /dev/dri/renderD* devices. All
      three belong to the same GPU, but the latter two allow using the
      card for rendering rather than for CUDA computations etc. I had an
      idea how to achieve that without changing the SLURM codebase, and
      I got something that could almost work. It probably just needs
      some polishing. Could anybody please comment on whether the
      proposed solution is a good idea?</p>

    <p><br>

    </p>

    <p>The 15 Jan 2021 message:<br>

    </p>

    <p><br>

    </p>

    <p>So I started wondering whether this could be handled by a
      prologue script and direct cgroup manipulation. I'm no expert in
      either, so please check my line of thought.</p>

    <p><br>

    </p>

    <pre class="lang-sh s-code-block hljs bash"><code>#!/bin/bash
PATH=/usr/bin:/bin
gpus="${SLURM_STEP_GPUS:-$SLURM_JOB_GPUS}"  # or CUDA_VISIBLE_DEVICES when run inside the cgroup?
cgroup=$(grep devices /proc/self/cgroup | cut -d: -f3)  # or something else?

# deny access to all DRM devices (character devices with major 226)
cgset -r devices.deny="c 226:* rwm" "devices:${cgroup}"

for NVIDIA_SMI_ID in ${gpus//,/ }; do
  # find the PCI address of this GPU; nvidia-smi prints an 8-digit PCI domain,
  # sysfs uses 4 digits, hence tail -c+5 to drop the first four characters
  pci_id=$(nvidia-smi -i "$NVIDIA_SMI_ID" --query-gpu=pci.bus_id --format=noheader,csv | tail -c+5 | tr '[:upper:]' '[:lower:]')
  # find the DRM devices belonging to the same PCI device
  card=$(ls "/sys/bus/pci/devices/${pci_id}/drm/" | grep '^card')
  render=$(ls "/sys/bus/pci/devices/${pci_id}/drm/" | grep '^renderD')
  # allow access to the DRM devices
  [ -n "${card}" ] && cgset -r devices.allow="c $(cat "/sys/class/drm/${card}/dev") rw" "devices:${cgroup}" && echo "Allowed /dev/dri/${card} DRI device access"
  [ -n "${render}" ] && cgset -r devices.allow="c $(cat "/sys/class/drm/${render}/dev") rw" "devices:${cgroup}" && echo "Allowed /dev/dri/${render} render node access"
done
</code></pre>
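    <p>The two string conversions the script relies on can be checked in
      isolation, without a GPU or a cgroup at hand. This is only a
      minimal sketch: the function names nvidia_to_sysfs_pci_id and
      drm_allow_rule are made up for illustration, and it assumes bash 4
      or newer and that nvidia-smi prints the PCI domain with 8 hex
      digits while sysfs uses 4.</p>

```shell
#!/bin/bash
# Hypothetical helpers, not part of SLURM or of any NVIDIA tool.

# Convert a PCI bus ID as printed by nvidia-smi (8-digit domain, upper-case hex)
# into the form used under /sys/bus/pci/devices (4-digit domain, lower-case hex).
nvidia_to_sysfs_pci_id() {
  local id="${1,,}"   # lower-case the whole string (bash 4+)
  # Drop four leading domain digits only when the domain really has 8 digits.
  if [[ "$id" =~ ^[0-9a-f]{8}: ]]; then
    id="${id:4}"
  fi
  printf '%s\n' "$id"
}

# Build a cgroup v1 devices rule from the "major:minor" content of a sysfs
# dev file, e.g. the content of /sys/class/drm/renderD128/dev.
drm_allow_rule() {
  printf 'c %s rw\n' "$1"
}
```

    <p>For example, nvidia_to_sysfs_pci_id applied to the output of
      nvidia-smi --query-gpu=pci.bus_id gives the directory name to look
      up under /sys/bus/pci/devices, and the output of drm_allow_rule is
      the value that would be passed to devices.allow.</p>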

    <p>Now I wonder whether this should be Prolog=, TaskProlog= or
      something else (that would also determine whether I look at
      CUDA_VISIBLE_DEVICES or SLURM_STEP_GPUS, and how I figure out the
      cgroup name). I guess that if this script ran as the invoking
      user, nothing would prevent that user from regaining access to all
      devices. So I'm inclined to treat it as a Prolog= script run
      by root. How would I get the cgroup ID then? Compose it from parts
      as mentioned in the slurm cgroups docs
      (/cgroup/cpuset/slurm/uid_100/job_123/step_0/task_2)? Or is there a
      more reliable way?</p>
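    <p>For the "compose it from parts" option, a rough sketch of what a
      root-run Prolog could do is below. It rests on several
      assumptions that I have not verified: that SLURM_JOB_UID and
      SLURM_JOB_ID are set in the Prolog environment, that the devices
      hierarchy uses the same slurm/uid_UID/job_JOBID layout as the
      cpuset path quoted above, that cgroup v1 is mounted under
      /sys/fs/cgroup, and that the job cgroup already exists when the
      Prolog runs. The function name job_devices_cgroup is made up.</p>

```shell
#!/bin/bash
# Hypothetical helper: compose the job-level cgroup name from uid and job id.
# The slurm/uid_UID/job_JOBID layout is an assumption taken from the path
# quoted in the SLURM cgroups docs; it may differ between versions and setups.
job_devices_cgroup() {
  local uid="$1" job_id="$2"
  printf '/slurm/uid_%s/job_%s\n' "$uid" "$job_id"
}

# Only act when both variables are present, i.e. when started by slurmd.
if [ -n "${SLURM_JOB_UID:-}" ] && [ -n "${SLURM_JOB_ID:-}" ]; then
  cgroup=$(job_devices_cgroup "$SLURM_JOB_UID" "$SLURM_JOB_ID")
  # Assumed cgroup v1 mount point of the devices controller.
  if [ -d "/sys/fs/cgroup/devices${cgroup}" ]; then
    echo "job devices cgroup: ${cgroup}"
  else
    echo "cgroup ${cgroup} does not exist (yet?)" >&2
  fi
fi
```

    <p>The resulting name would replace the /proc/self/cgroup lookup in
      the script above, which only works when the script already runs
      inside the job's cgroup.</p>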

    <p><br>

    </p>

    <p>A related but off-topic idea came to mind while I was thinking
      about GPUs. Most of them are actually a bundle of several
      units such as stream processors, encoders, decoders, ray-tracing
      units, shaders, memory etc. Could it be possible (in the future) to
      offer each of these pieces as a separate gres? The problem is that
      most of them do not have any special file which a user could
      lock to tell others that the unit is in use. So it would probably
      require support at the level of the cgroup implementation, which, in
      turn, would require changing all GPU drivers. It would also require
      being able to request just chunks of GPU memory (not sure whether
      that's possible right now, but I think I saw a pull request
      about that).<br>

    </p>

    <p><br>

    </p>

    <p>Thank you for any hints!</p>

    <p><br>

    </p>

    <p>Martin<br>

    </p>

    <p><br>

    </p>

    <p>On 21.10.2020 at 19:09, Martin Pecka wrote: </p>

    <blockquote type="cite">


      <p>Or maybe this could be "emulated" by a set of 3 GRES per card
        that are "linked" together? I.e. rules like "if the user
        requests GRES /dev/dri/card0, they will also automatically need to
        claim /dev/dri/renderD128 and /dev/nvidia0"? </p>

      <p><br>

      </p>

      <p>On 21.10.2020 at 18:52, Daniel Letai wrote: </p>

      <blockquote type="cite">


        <style id="bidiui-paragraph-margins" type="text/css">body p { margin-bottom: 0cm; margin-top: 0pt; }</style>

        <p>Take a look at <a class="moz-txt-link-freetext"

            href="https://github.com/SchedMD/slurm/search?q=dri%2F"

            moz-do-not-send="true">https://github.com/SchedMD/slurm/search?q=dri%2F</a></p>

        <p>If the ROCM-SMI API is present, using AutoDetect=rsmi in

          gres.conf might be enough, if I'm reading this right.</p>

        <p><br>

        </p>

        <p>Of course, this assumes the cards in question are AMD and not

          NVIDIA.</p>

        <p><br>

        </p>

        <div class="moz-cite-prefix">On 20/10/2020 23:58, Mgr. Martin

          Pecka wrote:<br>

        </div>

        <blockquote type="cite"

          cite="mid:039ab76d-b5e8-972e-b5bd-5b42bfe71cf3@fel.cvut.cz">Pinging

          this topic again. Nobody has an idea how to define multiple

          files to be treated as a single gres? <br>

          <br>

          Thank you for help, <br>

          <br>

          Martin Pecka <br>

          <br>

          On 4.9.2020 at 21:29, Martin Pecka wrote: <br>

          <br>

          <blockquote type="cite">Hello, we want to use EGL backend for

            accessing OpenGL without the need for Xorg. This approach

            requires access to devices /dev/dri/card* and

            /dev/dri/renderD* . Is there a way to give access to these

            devices along with /dev/nvidia* which we use for CUDA?

            Ideally as a single generic resource that would give

            permissions to all three files at once. <br>

            <br>

            Thank you for any tips. <br>

            <br>

          </blockquote>

          <br>

          <br>

        </blockquote>

      </blockquote>

      <p><br>

      </p>

    </blockquote>

    <p><br>

    </p>

  </body>

</html>