[slurm-users] disable-bindings disables counting of gres resources
steinbac at mpi-cbg.de
Mon Apr 15 15:15:34 UTC 2019
Thanks for following up on this thread.
> First of all, you will want to use cgroups to ensure that processes that do
> not request GPUs cannot access them.
We suspected that cgroups might be the better option. Could you point us
to documentation that describes cgroups as a requirement?
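
For context, a minimal sketch of the settings usually involved in
confining GPU devices with cgroups. The option names are standard Slurm
options, but the device path and the choice of plugins are assumptions
for illustration and do not describe our actual setup:

```
# slurm.conf (assumed excerpt)
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
GresTypes=gpu

# cgroup.conf (assumed excerpt)
ConstrainDevices=yes

# gres.conf (assumed excerpt; /dev/nvidia0 is a placeholder device path)
Name=gpu File=/dev/nvidia0
```

With ConstrainDevices=yes, a job step should only see the device files
of the GRES it was allocated.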
> Secondly, do your CPUs have hyperthreading enabled by some chance?
> If so then your gres.conf is likely wrong as you'll want to list the first HT
> on each core that you want to restrict access to.
No HT is involved here at any point, neither on our cluster nor within the
dockerized Slurm installation I was playing with.
> From the manual page for gres.conf:
> NOTE: If your cores contain multiple threads only list the first thread
> of each core. The logic is such that it uses core instead of thread
> scheduling per GRES. Also note that since Slurm must be able to perform
> resource management on heterogeneous clusters having various core ID
> numbering schemes, an abstract index will be used instead of the physical
> core index. That abstract id may not correspond to your physical core
> number. Basically Slurm starts numbering from 0 to n, being 0 the id of
> the first processing unit (core or thread if HT is enabled) on the first
> socket, first core and maybe first thread, and then continuing
> sequentially to the next thread, core, and socket. The numbering generally
> coincides with the processing unit logical number (PU L#) seen in lstopo
We are aware of this section of the man page, thanks.
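
To make the numbering in that note concrete, here is a small sketch that
follows the quoted description literally: IDs run from 0 upwards through
thread, then core, then socket, and only the first thread of each core is
listed. The function name and the example topologies are hypothetical;
the output should be compared against lstopo on the actual node.

```python
def first_thread_ids(sockets, cores_per_socket, threads_per_core):
    """Return the abstract ID of the first hardware thread on every core.

    IDs are assigned sequentially: all threads of a core, then the next
    core, then the next socket, as described in the gres.conf note.
    """
    ids = []
    next_id = 0
    for _socket in range(sockets):
        for _core in range(cores_per_socket):
            ids.append(next_id)          # first thread of this core
            next_id += threads_per_core  # skip the core's sibling threads
    return ids


if __name__ == "__main__":
    # One socket, four cores, HT enabled (two threads per core).
    print(first_thread_ids(1, 4, 2))
    # Two sockets, two cores each, no HT (our case): every ID is listed.
    print(first_thread_ids(2, 2, 1))
```

Without HT, as on our machines, every processing unit is the first
thread of its core, so the list is simply 0 to n-1.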