[slurm-users] Meaning of --cpus-per-task and --mem-per-cpu when SMT processors are used

Alexander Grund alexander.grund at tu-dresden.de
Wed Mar 4 12:25:40 UTC 2020

 > What is your hardware configuration? Do you have 1 server with 44
 > processor sockets, and each processor has 4 CPU cores? Or is it maybe 1
 > server with 1 or more sockets for a total of 44 CPU cores, and each CPU
 > core is running 4 hyperthreads?

1 server with 2 sockets, 22 cores per socket, and 4 hyperthreads per
core: 2*22*4 = 176, which matches "CPUTot" as reported by
"scontrol show node".

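To make the arithmetic above explicit, here is a minimal Python sketch that derives both the physical core count and the hardware thread count from the topology stated in this thread. The function and constant names are made up for illustration; they are not part of Slurm or pyslurm.

```python
# Node topology as stated in this thread.
SOCKETS = 2
CORES_PER_SOCKET = 22
THREADS_PER_CORE = 4


def physical_cores(sockets: int, cores_per_socket: int) -> int:
    """Number of physical cores on the node."""
    return sockets * cores_per_socket


def hardware_threads(sockets: int, cores_per_socket: int,
                     threads_per_core: int) -> int:
    """Number of hardware threads (logical CPUs) on the node."""
    return physical_cores(sockets, cores_per_socket) * threads_per_core


cores = physical_cores(SOCKETS, CORES_PER_SOCKET)
threads = hardware_threads(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
print(cores)    # 44
print(threads)  # 176
```

The second number is the one that shows up as "CPUTot"; the first is the physical core count that the rest of this thread compares against.
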
 > I think you should give the relevant node and partition lines from
 > your slurm.conf.

I found the following in node.conf:

NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2 CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN Weight=128
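
As a quick consistency check on that node definition, the following Python sketch splits the line into key=value pairs and confirms that Procs equals Sockets * CoresPerSocket * ThreadsPerCore. The helper parse_node_line is a hypothetical name used only here; it is a naive whitespace split, not a full slurm.conf parser.

```python
def parse_node_line(line: str) -> dict:
    """Split a whitespace-separated 'Key=Value' line into a dict of strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not Key=Value pairs
            fields[key] = value
    return fields


# The node definition quoted in this thread, joined onto one line.
NODE_LINE = (
    "NodeName=taurusml[1-32] Feature=IB Gres=gpu:6 Procs=176 Sockets=2 "
    "CoresPerSocket=22 ThreadsPerCore=4 RealMemory=254000 State=UNKNOWN "
    "Weight=128"
)

node = parse_node_line(NODE_LINE)
derived = (int(node["Sockets"])
           * int(node["CoresPerSocket"])
           * int(node["ThreadsPerCore"]))

# Procs counts hardware threads here, not physical cores.
print(derived == int(node["Procs"]))  # True
```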

 > Which Slurm version do you run?


 > The whypending tool does not appear in a google search. Where did you
 > get it from and what does it do?

It seems to be a Python script that shows why a job is pending. It uses
pyslurm. I thought it was a Slurm tool, but it might be a custom script.

 > >Most importantly: Does this mean `--cpus-per-task` can be as high as 
176 on this node and `--mem-per-cpu` can be up to the reported 
 > Yes.

 > This is just historical as far as I can tell. I think 'CPU' almost
 > always means 'core'.

I just tried a very simple example with 1 task and `--cpus-per-task=50`
(slightly more than the 44 physical cores), and it failed with
"Requested node configuration is not available".

So in summary: for srun/sbatch/salloc, "CPU" means "(physical) core".
For scontrol (and pyslurm, which seems to wrap it), "CPU" means
"thread". This is confusing, but at least the question seems to be
answered now.
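
The two readings of "CPU" can be compared with a small Python sketch. It only models the single observation in this thread (a request for 50 CPUs per task was rejected on a node with 44 cores and 176 threads); request_fits is a hypothetical helper and does not reproduce Slurm's actual scheduling logic.

```python
def request_fits(cpus_per_task: int, sockets: int, cores_per_socket: int,
                 threads_per_core: int, unit: str) -> bool:
    """Hypothetical capacity check for one task on one node.

    unit="core"   counts a "CPU" as a physical core.
    unit="thread" counts a "CPU" as a hardware thread.
    """
    if unit == "core":
        capacity = sockets * cores_per_socket
    elif unit == "thread":
        capacity = sockets * cores_per_socket * threads_per_core
    else:
        raise ValueError("unit must be 'core' or 'thread'")
    return cpus_per_task <= capacity


# --cpus-per-task=50 on a node with 2 sockets, 22 cores each, 4 threads each.
print(request_fits(50, 2, 22, 4, unit="core"))    # False
print(request_fits(50, 2, 22, 4, unit="thread"))  # True
```

Only the "core" reading agrees with the "Requested node configuration is not available" error, which is what the summary above concludes.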

Alexander Grund
Interdisziplinäre Anwendungsunterstützung und Koordination (IAK)

Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
Würzburger Str.35/Chemnitzer Str.50, Raum 010 01062 Dresden
Tel.: +49 (351) 463-35982
E-Mail: alexander.grund at tu-dresden.de
