[slurm-users] Using hyperthreaded processors

Jean-Christophe HAESSIG haessigj at igbmc.fr
Thu Nov 5 08:19:43 UTC 2020


On Wednesday, November 4, 2020 at 21:41 +0000, Sebastian T Smith wrote:
> Hi,
Hi,

> We have Hyper-threading/SMT enabled on our cluster.  It's challenging
> to fully utilize threads, as Brian suggests.  We have a few workloads
> that benefit from it being enabled

Our cluster serves jobs in the field of biology, and it turns out that
they can spend a lot of time on memory accesses and I/O. CPUs are
therefore rarely used at 100% for the duration of a job. I feel that
using HT instead of overallocation plus context switching could be
slightly better. In any case, I'd have a hard time benchmarking each
and every program used on the cluster. However, as Brian explains, some
tasks are clearly hurt by HT (OpenMPI comes to mind), so it should be
possible to bypass it on a per-job basis.
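
For example, if I read the sbatch/srun man pages correctly, a job can
opt out of the hyperthreads with something like this (script and
binary names are placeholders):

    #!/bin/bash
    #SBATCH --ntasks=4
    #SBATCH --hint=nomultithread    # use only one thread (CPU) per core

    srun ./my_openmpi_program       # hypothetical MPI binary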

> We use SelectTypeParameters=CR_Core_Memory.  This configuration means
> that each thread counts as a CPU.

Assuming you declare Sockets/Cores/Threads as they physically exist on
the machines, I found that even though a thread counts as a CPU, Slurm
allocates processing units by core, that is, usually in blocks of 2
CPUs. (Quoting the FAQ: "Note that even on systems with hyperthreading
enabled, the resources will generally be allocated to jobs at the level
of a core [...]. Two different jobs will not share a core except
through the use of a partition OverSubscribe configuration parameter.
For example, a job requesting resources for three tasks on a node with
ThreadsPerCore=2 will be allocated two full cores. Note that Slurm
commands contain a multitude of options to control resource allocation
with respect to base boards, sockets, cores and threads.")
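
For reference, here is roughly what I mean by declaring the topology
as it physically exists (a hypothetical slurm.conf excerpt; node names
and counts are made up):

    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory
    # 2 sockets x 16 cores x 2 threads = 64 CPUs per node
    NodeName=node[01-10] Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 RealMemory=192000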

>   The biggest issue with this has been user support.

The most confusing part of this for our users is that, as I explained
above, they get handed 2 CPUs when they asked to launch a job with
ntasks=1. Worse, since Slurm thinks it has to grant 2 CPUs, when that
is combined with --mem-per-cpu, twice the requested amount of memory
gets allocated. This effectively wastes memory, and sometimes even
prevents a job from starting because not enough memory is available.
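
Concretely, the effect looks like this (numbers made up):

    $ sbatch --ntasks=1 --mem-per-cpu=4000 job.sh
    # On a node with ThreadsPerCore=2 the job gets a full core, i.e.
    # 2 CPUs, so 2 x 4000 MB = 8000 MB are reserved instead of 4000 MB.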

Did you find a way to deal with this?

Thanks,
J.C. Haessig


