[slurm-users] Meaning of --cpus-per-task and --mem-per-cpu when SMT processors are used
alexander.grund at tu-dresden.de
Wed Mar 4 12:25:40 UTC 2020
> What is your hardware configuration? Do you have 1 server with 44
> processor sockets, and each processor has 4 CPU cores? Or is it maybe 1
> server with 1 or more sockets for a total of 44 CPU cores, and each CPU
> core is running 4 hyperthreads?
1 server, 2 sockets, 22 cores each, 4 hyperthreads --> 2*22*4=176
"CPUTot" as reported by "scontrol show node"
> I think you should give the relevant node and partition lines from
I found the following in node.conf:
NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2
CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN
Weight=128
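To make the 2*22*4=176 arithmetic explicit, here is a minimal Python sketch that parses the node definition quoted above and derives both the physical core count and the hardware thread count. The helper names `parse_node_line` and `cpu_counts` are made up for illustration and are not part of Slurm or pyslurm.

```python
import re

# Node definition copied from the message above.
NODE_LINE = (
    "NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2 "
    "CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN Weight=128"
)


def parse_node_line(line):
    """Split a 'Key=Value Key=Value ...' line into a dict of strings (hypothetical helper)."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def cpu_counts(fields):
    """Return (physical_cores, hardware_threads) computed from the parsed fields."""
    sockets = int(fields["Sockets"])
    cores_per_socket = int(fields["CoresPerSocket"])
    threads_per_core = int(fields["ThreadsPerCore"])
    cores = sockets * cores_per_socket
    return cores, cores * threads_per_core


fields = parse_node_line(NODE_LINE)
cores, threads = cpu_counts(fields)
print(cores, threads)  # 44 176
```

The thread count matches both Procs=176 in the config line and the "CPUTot" value reported by "scontrol show node".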
> Which Slurm version do you run?
> The whypending tool does not appear in a google search. Where did you
> get it from and what does it do?
It seems to be a Python script showing why a job is pending. It uses
pyslurm. I thought it was a Slurm tool, but it might be some custom thing
> > Most importantly: Does this mean `--cpus-per-task` can be as high as
> > 176 on this node and `--mem-per-cpu` can be up to the reported
> This is just historical as far as I can tell. I think 'CPU' almost
> always means 'core'.
I just tried a very simple example with 1 task and `--cpus-per-task=50`
(slightly higher than the 44 physical cores) and it failed with
"Requested node configuration is not available"
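The result of that test can be checked against the two possible readings of "CPU" with a small Python sketch. This is only the arithmetic for this node (2 sockets, 22 cores each, 4 threads per core), not Slurm's actual scheduling logic, and the function names are hypothetical.

```python
# Hardware layout of the node as given earlier in this message.
SOCKETS = 2
CORES_PER_SOCKET = 22
THREADS_PER_CORE = 4


def max_cpus_per_task(cpu_means):
    """Upper bound for a single task on one node, depending on what 'CPU' counts."""
    cores = SOCKETS * CORES_PER_SOCKET
    if cpu_means == "core":
        return cores
    if cpu_means == "thread":
        return cores * THREADS_PER_CORE
    raise ValueError("cpu_means must be 'core' or 'thread'")


def request_fits(cpus_per_task, cpu_means):
    """True if the request would fit on one node under the given interpretation."""
    return cpus_per_task <= max_cpus_per_task(cpu_means)


# A request of 50 exceeds the 44 cores but not the 176 threads.
print(request_fits(50, "core"))    # False
print(request_fits(50, "thread"))  # True
```

The observed rejection of `--cpus-per-task=50` is consistent with the "core" reading, under which the limit on this node would be 44.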
So in summary: for srun/sbatch/salloc, "CPU" means "(physical) core".
For scontrol (and pyslurm, which seems to wrap it), "CPU" means
"thread". This is confusing but at least the question seems to be
Interdisziplinäre Anwendungsunterstützung und Koordination (IAK)
Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
Würzburger Str.35/Chemnitzer Str.50, Raum 010 01062 Dresden
Tel.: +49 (351) 463-35982
E-Mail: alexander.grund at tu-dresden.de