[slurm-users] disable-bindings disables counting of gres resources

Peter Steinbach steinbac at mpi-cbg.de
Mon Apr 15 15:15:34 UTC 2019

Hi Chris,

Thanks for following up on this thread.

> First of all, you will want to use cgroups to ensure that processes that do
> not request GPUs cannot access them.

We had a feeling that cgroups might be the better option. Could you point
us to documentation that says cgroups are a requirement?

> Secondly, do your CPUs have hyperthreading enabled by some chance?
> If so then your gres.conf is likely wrong as you'll want to list the first HT
> on each core that you want to restrict access to.

No HT involved here at any point, neither on our cluster nor within the 
dockerized slurm installation I was playing with.

> From the manual page for gres.conf:
>     NOTE: If your cores contain multiple threads only list the first thread
>     of each core. The logic is such that it uses core instead of thread
>     scheduling per GRES. Also note that since Slurm must be able to perform
>     resource management on heterogeneous clusters having various core ID
>     numbering schemes, an abstract index will be used instead of the physical
>     core index. That abstract id may not correspond to your physical core
>     number. Basically Slurm starts numbering from 0 to n, being 0 the id of
>     the first processing unit (core or thread if HT is enabled) on the first
>     socket, first core and maybe first thread, and then continuing
>     sequentially to the next thread, core, and socket. The numbering generally
>     coincides with the processing unit logical number (PU L#) seen in lstopo
>     output.

We are aware of this section of the man page, thanks.
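
For reference, the numbering described in that note can be sketched in a few
lines of Python. This is only an illustration of how the note reads, not
Slurm's actual implementation: the function names and the assumption of a
homogeneous node (same core count on every socket, same thread count on every
core) are hypothetical.

```python
def abstract_index(socket, core, thread, cores_per_socket, threads_per_core):
    """Sequential id of a processing unit, as the quoted note describes it.

    Counting runs over the threads of a core first, then the cores of a
    socket, then the sockets (assumes a homogeneous node layout).
    """
    return (socket * cores_per_socket + core) * threads_per_core + thread


def first_thread_ids(sockets, cores_per_socket, threads_per_core):
    """Ids of the first thread of every core, i.e. the ones the note says to list."""
    return [
        abstract_index(s, c, 0, cores_per_socket, threads_per_core)
        for s in range(sockets)
        for c in range(cores_per_socket)
    ]


if __name__ == "__main__":
    # Hypothetical node: 2 sockets, 4 cores per socket.
    print("no HT:", first_thread_ids(2, 4, 1))
    print("HT x2:", first_thread_ids(2, 4, 2))
```

Without hyperthreading (our case) every processing unit is a core, so the ids
are simply consecutive and the first-thread rule changes nothing.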


