[slurm-users] GRES Restrictions

Thu Apr 15 07:59:27 UTC 2021

Hello,

Is there a best practice for activating this feature (setting
ConstrainDevices=yes)? Do I have to restart the slurmds? Does this affect
running jobs?

We are using Slurm 19.05.
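
For context, this is roughly the change I have in mind. It is only a sketch:
the node names and the GPU device paths are placeholders, not our real
configuration, and I am assuming the usual task/cgroup setup with the device
files listed explicitly, as Christoph describes below.

```conf
# cgroup.conf
# Restrict each job to the devices it was allocated via GRES.
ConstrainDevices=yes

# slurm.conf (relevant lines only)
# ConstrainDevices is enforced by the task/cgroup plugin.
TaskPlugin=task/cgroup
GresTypes=gpu

# gres.conf
# Hypothetical node names and device files; adjust to the real hardware.
NodeName=gpunode[01-02] Name=gpu File=/dev/nvidia[0-1]
```

With this in place, a job submitted without --gres=gpu:# should not be able
to open any of the listed device files, if I understand the documentation
correctly.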

Best,
Stefan

On Tuesday, 25 August 2020, 17:24:41 CEST, Christoph Brüning wrote:
> Hello,
> 
> we're using cgroups to restrict access to the GPUs.
> 
> What I found particularly helpful are the slides by Marshall Garey from
> last year's Slurm User Group Meeting:
> https://slurm.schedmd.com/SLUG19/cgroups_and_pam_slurm_adopt.pdf
> (NVML didn't work for us for some reason I cannot recall, but listing
> the GPU device files explicitly was not a big deal)
> 
> Best,
> Christoph
> 
> On 25/08/2020 16.12, Willy Markuske wrote:
> > Hello,
> > 
> > I'm trying to restrict access to gpu resources on a cluster I maintain
> > for a research group. There are two nodes put into a partition with gres
> > gpu resources defined. Users can access these resources by submitting
> > their job under the gpu partition and defining a gres=gpu.
> > 
> > When a user includes the flag --gres=gpu:#, Slurm properly allocates
> > that number of GPUs to them. If a user requests only 1 GPU,
> > they only see CUDA_VISIBLE_DEVICES=1. However, if a user does not
> > include the --gres=gpu:# flag, they can still submit a job to the
> > partition and are then able to see all the GPUs. This has led to some
> > bad actors running jobs on all GPUs that other users have allocated,
> > causing OOM errors on the GPUs.
> > 
> > Is it possible to require users to specify --gres=gpu:# in order to
> > submit to a partition, and if so, where would I find the documentation
> > on doing so? So far, reading the gres documentation hasn't turned up
> > anything on this issue specifically.
> > 
> > Regards,

-- 
Stefan Stäglich,  Universität Freiburg,  Institut für Informatik
Georges-Köhler-Allee,  Geb.52,   79110 Freiburg,    Germany

E-Mail : staeglis at informatik.uni-freiburg.de
WWW    : gki.informatik.uni-freiburg.de
Telefon: +49 761 203-54216
Fax    : +49 761 203-8222