[slurm-users] [ext] Enforce gpu usage limits (with GRES?)

Analabha Roy hariseldon99 at gmail.com
Thu Feb 2 17:51:57 UTC 2023


Hi,

Thanks for the reply. Yes, your advice helped! Much obliged. The cgroups
configuration was necessary, and in addition the option

ConstrainDevices=yes

in cgroup.conf was needed to enforce the gpu gres. Now, gpu jobs that omit
the gres parameter to srun fail. An improvement!
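
For anyone who finds this thread later, here is a minimal cgroup.conf
sketch of the kind of setup that worked for me (adapted from Manuel's
config quoted below; the mountpoint and allowed-devices file are from my
single-node install and may differ on yours):

==> /etc/slurm/cgroup.conf <==
CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=yes
# ConstrainDevices is what actually blocks /dev/nvidia0 for job steps that
# do not request a gpu gres; it takes effect via TaskPlugin=task/cgroup
# in slurm.conf (see Manuel's config below).
ConstrainDevices=yes
AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"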

However, I still can't keep gpu jobs out of the "CPU" partition. Is there a
way to link a partition to a GRES, or something similar?

Alternatively, can I define two node names in slurm.conf that point to the
same physical node, where only one of them carries the gpu GRES? That way I
could link the GPU partition to the gres-configured node name only.
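
Purely to illustrate the idea, something like the sketch below (the short
node names are hypothetical, and I haven't tested whether slurmd accepts
two NodeName entries that point at one host):

NodeName=shavak-cpu NodeHostname=shavak-DIT400TR-55L CPUs=64 SocketsPerBoard=2 CoresPerSocket=32 ThreadsPerCore=1 RealMemory=95311
NodeName=shavak-gpu NodeHostname=shavak-DIT400TR-55L CPUs=64 SocketsPerBoard=2 CoresPerSocket=32 ThreadsPerCore=1 RealMemory=95311 Gres=gpu:1
PartitionName=CPU Nodes=shavak-cpu Default=YES MaxTime=INFINITE State=UP
PartitionName=GPU Nodes=shavak-gpu Default=NO MaxTime=INFINITE State=UP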

Thanks in advance,
AR

*PS*: If the Slurm devs are reading this, may I suggest adding a reference
to cgroups on the GRES documentation page?

On Thu, 2 Feb 2023 at 16:52, Holtgrewe, Manuel <
manuel.holtgrewe at bih-charite.de> wrote:

> Hi,
>
>
> if by "share the GPU" you mean exclusive allocation to a single job, then
> I believe you are missing the cgroup configuration for isolating access to
> the GPU.
>
>
> Below are the relevant parts (I believe) of our configuration.
>
>
> There is also a way to time- and space-slice GPUs, but I suggest you get
> things set up without slicing first.
>
>
> I hope this helps.
>
>
> Manuel
>
>
> ==> /etc/slurm/cgroup.conf <==
> # https://bugs.schedmd.com/show_bug.cgi?id=3701
> CgroupMountpoint="/sys/fs/cgroup"
> CgroupAutomount=yes
> AllowedDevicesFile="/etc/slurm/cgroup_allowed_devices_file.conf"
>
> ==> /etc/slurm/cgroup_allowed_devices_file.conf <==
> /dev/null
> /dev/urandom
> /dev/zero
> /dev/sda*
> /dev/cpu/*/*
> /dev/pts/*
> /dev/nvidia*
>
> ==> /etc/slurm/slurm.conf <==
>
> ProctrackType=proctrack/cgroup
>
> # Memory is enforced via cgroups, so we should not do this here, per [*]
> #
> # /etc/slurm/cgroup.conf: ConstrainRAMSpace=yes
> #
> # [*] https://bugs.schedmd.com/show_bug.cgi?id=5262
> JobAcctGatherParams=NoOverMemoryKill
>
> TaskPlugin=task/cgroup
>
> JobAcctGatherType=jobacct_gather/cgroup
>
>
> --
> Dr. Manuel Holtgrewe, Dipl.-Inform.
> Bioinformatician
> Core Unit Bioinformatics – CUBI
> Berlin Institute of Health / Max Delbrück Center for Molecular Medicine in
> the Helmholtz Association / Charité – Universitätsmedizin Berlin
>
> Visiting Address: Invalidenstr. 80, 3rd Floor, Room 03 028, 10117 Berlin
> Postal Address: Chariteplatz 1, 10117 Berlin
>
> E-Mail: manuel.holtgrewe at bihealth.de
> Phone: +49 30 450 543 607
> Fax: +49 30 450 7 543 901
> Web: cubi.bihealth.org  www.bihealth.org  www.mdc-berlin.de
> www.charite.de
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Analabha Roy <hariseldon99 at gmail.com>
> *Sent:* Wednesday, February 1, 2023 6:12:40 PM
> *To:* slurm-users at lists.schedmd.com
> *Subject:* [ext] [slurm-users] Enforce gpu usage limits (with GRES?)
>
> Hi,
>
> I'm new to slurm, so I apologize in advance if my question seems basic.
>
> I just purchased a single node 'cluster' consisting of one 64-core cpu and
> an nvidia rtx5k gpu (Turing architecture, I think). The vendor supplied it
> with ubuntu 20.04 and slurm-wlm 19.05.5. Now I'm trying to adjust the
> config to suit the needs of my department.
>
> I'm trying to bone up on GRES scheduling by reading this manual page
> <https://slurm.schedmd.com/gres.html>, but am confused about some things.
>
> My slurm.conf file has the following lines put in it by the vendor:
>
> ###################
> # COMPUTE NODES
> GresTypes=gpu
> NodeName=shavak-DIT400TR-55L CPUs=64 SocketsPerBoard=2 CoresPerSocket=32
> ThreadsPerCore=1 RealMemory=95311 Gres=gpu:1
> #PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>
> PartitionName=CPU Nodes=ALL Default=Yes MaxTime=INFINITE  State=UP
>
> PartitionName=GPU Nodes=ALL Default=NO MaxTime=INFINITE  State=UP
> #####################
>
> So they created two partitions that are essentially identical. They also
> put just the following line in gres.conf:
>
> ###################
> NodeName=shavak-DIT400TR-55L      Name=gpu        File=/dev/nvidia0
> ###################
>
> That's all. However, this configuration does not appear to constrain
> anyone in any manner. As a regular user, I can still use srun or sbatch to
> start GPU jobs from the "CPU" partition, and nvidia-smi shows that a simple
> cupy <https://cupy.dev/> script that multiplies matrices, submitted as an
> sbatch job to the CPU partition, can access the gpu just fine. Note that
> the environment variable CUDA_VISIBLE_DEVICES does not appear to be set in
> any job step. I tested this by starting an interactive srun shell in both
> the CPU and GPU partitions and running "echo $CUDA_VISIBLE_DEVICES", and
> got bupkis for both.
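>
> The check was roughly this, in each partition (nothing was printed either
> time):
>
> srun -p CPU --pty bash -i
> echo $CUDA_VISIBLE_DEVICES
>
> srun -p GPU --pty bash -i
> echo $CUDA_VISIBLE_DEVICES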
>
>
> What I need to do is constrain jobs to chunks of GPU cores/RAM so that
> multiple jobs can share the GPU.
>
> As I understand from the gres manpage, simply adding "AutoDetect=nvml"
> (NVML should be installed with the NVIDIA HPC SDK, right? I installed it
> with apt-get...) to gres.conf should allow Slurm to detect the GPU's
> internal specifications automatically. Is that all, or do I need to
> configure an mps GRES as well? Will that wall the GPU off from jobs that
> don't request any gres (perhaps by setting CUDA_VISIBLE_DEVICES), or is
> additional configuration needed for that? Do I really need that extra
> "GPU" partition that the vendor put in for any of this, or is there a way
> to bind GRES resources to a particular partition in such a way that simply
> launching jobs in that partition will be enough?
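>
> For concreteness, the kind of configuration I have in mind, based on the
> examples on the gres page, is roughly the following (untested on my box,
> and it assumes Slurm was built with NVML support; the mps Count of 100 is
> an arbitrary example):
>
> ==> /etc/slurm/gres.conf <==
> # Let Slurm query the GPU through NVML instead of listing File=/dev/nvidia0
> AutoDetect=nvml
> # Optionally carve the GPU into 100 MPS shares so jobs can request fractions
> Name=mps Count=100
>
> ==> /etc/slurm/slurm.conf <==
> GresTypes=gpu,mps
> NodeName=shavak-DIT400TR-55L CPUs=64 SocketsPerBoard=2 CoresPerSocket=32 ThreadsPerCore=1 RealMemory=95311 Gres=gpu:1,mps:100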
>
> Thanks for your attention.
> Regards
> AR
>
> --
> Analabha Roy
> Assistant Professor
> Department of Physics
> <http://www.buruniv.ac.in/academics/department/physics>
> The University of Burdwan <http://www.buruniv.ac.in/>
> Golapbag Campus, Barddhaman 713104
> West Bengal, India
> Emails: daneel at utexas.edu, aroy at phys.buruniv.ac.in, hariseldon99 at gmail.com
> Webpage: http://www.ph.utexas.edu/~daneel/
>


-- 
Analabha Roy
Assistant Professor
Department of Physics
<http://www.buruniv.ac.in/academics/department/physics>
The University of Burdwan <http://www.buruniv.ac.in/>
Golapbag Campus, Barddhaman 713104
West Bengal, India
Emails: daneel at utexas.edu, aroy at phys.buruniv.ac.in, hariseldon99 at gmail.com
Webpage: http://www.ph.utexas.edu/~daneel/