[slurm-users] Meaning of --cpus-per-task and --mem-per-cpu when SMT processors are used
Alexander Grund
alexander.grund at tu-dresden.de
Wed Mar 4 12:25:40 UTC 2020
> What is your hardware configuration? Do you have 1 server with 44
> processor sockets, and each processor has 4 CPU cores? Or is it maybe 1
> server with 1 or more sockets for a total of 44 CPU cores, and each CPU
> core is running 4 hyperthreads?
1 server with 2 sockets, 22 cores per socket, and 4 hardware threads per
core: 2*22*4 = 176, which matches the "CPUTot" reported by
"scontrol show node".
> I think you should give the relevant node and partition lines from
> your slurm.conf.
I found the following in node.conf:

    NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2 CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN Weight=128
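
As a quick sanity check of the numbers above, here is a minimal Python sketch that splits the node definition into key/value pairs and recomputes the core and thread counts. The helper name `parse_node_line` is made up for illustration; it is not part of Slurm or pyslurm, and it only handles simple space-separated `Key=Value` fields like the ones in this line.

```python
# Node definition copied from the node.conf excerpt above.
NODE_LINE = (
    "NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2 "
    "CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN Weight=128"
)


def parse_node_line(line):
    """Split a 'Key=Value Key=Value ...' line into a dict of strings.

    Hypothetical helper for illustration only; real slurm.conf syntax
    has more cases (comments, line continuations, defaults).
    """
    return dict(field.split("=", 1) for field in line.split())


node = parse_node_line(NODE_LINE)
sockets = int(node["Sockets"])
cores_per_socket = int(node["CoresPerSocket"])
threads_per_core = int(node["ThreadsPerCore"])

physical_cores = sockets * cores_per_socket          # 2 * 22
hardware_threads = physical_cores * threads_per_core  # 44 * 4

print(physical_cores)    # 44
print(hardware_threads)  # 176, the same value as Procs and CPUTot
```

So the configured `Procs=176` counts hardware threads, not physical cores.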
> Which Slurm version do you run?
19.05.5
> The whypending tool does not appear in a google search. Where did you
> get it from and what does it do?
It seems to be a Python script that shows why a job is pending. It uses
pyslurm. I thought it was a Slurm tool, but it might be something custom.
> > Most importantly: Does this mean `--cpus-per-task` can be as high as
> > 176 on this node and `--mem-per-cpu` can be up to the reported
> > "RealMemory"/176?
> Yes.
> This is just historical as far as I can tell. I think 'CPU' almost
> always means 'core'.
I just tried a very simple example with 1 task and `--cpus-per-task=50`
(slightly more than the 44 physical cores), and it failed with
"Requested node configuration is not available".
So in summary, at least on this cluster: for srun/sbatch/salloc, "CPU"
means "(physical) core". For scontrol (and pyslurm, which seems to wrap
it), "CPU" means "thread". This is confusing, but at least the question
seems to be answered now.
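
To make the two readings concrete, here is a small Python sketch of the arithmetic only. It is a toy model of the observation above, not Slurm's actual scheduling logic; the function names and the `cpu_means` parameter are invented for illustration. RealMemory is in megabytes, as in slurm.conf.

```python
# Values taken from the node.conf line quoted earlier in this mail.
SOCKETS = 2
CORES_PER_SOCKET = 22
THREADS_PER_CORE = 4
REAL_MEMORY_MB = 254000


def max_cpus(cpu_means):
    """Upper bound on 'CPUs' per node under one reading of the word.

    cpu_means is a hypothetical switch: "core" or "thread".
    """
    cores = SOCKETS * CORES_PER_SOCKET
    if cpu_means == "core":
        return cores
    if cpu_means == "thread":
        return cores * THREADS_PER_CORE
    raise ValueError("cpu_means must be 'core' or 'thread'")


def request_fits(cpus_per_task, cpu_means):
    """True if a single task asking for cpus_per_task fits on one node."""
    return cpus_per_task <= max_cpus(cpu_means)


def mem_per_cpu_ceiling_mb(cpu_means):
    """Largest whole-MB per-CPU memory if every 'CPU' on the node is used."""
    return REAL_MEMORY_MB // max_cpus(cpu_means)


# The failed experiment: 50 CPUs for one task.
print(request_fits(50, "core"))    # False, consistent with the srun error
print(request_fits(50, "thread"))  # True, what CPUTot=176 would suggest
```

Under the "core" reading, the per-CPU memory ceiling would be RealMemory/44 rather than RealMemory/176, which is what `mem_per_cpu_ceiling_mb` computes for each case.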
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Alexander Grund
Interdisziplinäre Anwendungsunterstützung und Koordination (IAK)
Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
Würzburger Str.35/Chemnitzer Str.50, Raum 010 01062 Dresden
Tel.: +49 (351) 463-35982
E-Mail: alexander.grund at tu-dresden.de
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~