[slurm-users] Meaning of --cpus-per-task and --mem-per-cpu when SMT processors are used

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Mar 4 09:31:00 UTC 2020


On 3/4/20 10:12 AM, Alexander Grund wrote:
> we have a Power9 partition with 44 processors having 4 cores each totaling 
> 176.

What is your hardware configuration?  Do you have 1 server with 44 
processor sockets, and each processor has 4 CPU cores?  Or is it maybe 1 
server with 1 or more sockets for a total of 44 CPU cores, and each CPU 
core is running 4 hardware threads (SMT4)?

Please post the relevant NodeName and PartitionName lines from your 
slurm.conf.

Which Slurm version do you run?

> `scontrol show node <node>` shows "CoresPerSocket=22" and "CPUTot=176" 
> which confuses me. Especially as `whypending` reports e.g. "172 cores free: 1"

The whypending tool does not turn up in a Google search.  Where did you get 
it from and what does it do?

> So what are "CPUs" and what are "Cores" to SLURM? Why does it mix up those 2?
> 
> Most importantly: Does this mean `--cpus-per-task` can be as high as 176 
> on this node and `--mem-per-cpu` can be up to the reported "RealMemory"/176?

Perhaps this page will be of use to you:
https://slurm.schedmd.com/cpu_management.html
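
As a rough sketch of the arithmetic behind the two hardware readings I 
asked about above (not a statement of how your node is actually 
configured): only CoresPerSocket=22 and CPUTot=176 come from your scontrol 
output.  The socket count of 2, the RealMemory figure, and the names 
NodeTopology and mem_per_cpu_mb are assumptions made up for illustration. 
It also assumes Slurm counts every hardware thread as one CPU, which is 
the default when CPUs is not set explicitly in slurm.conf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeTopology:
    """Hypothetical helper: the three numbers that multiply up to a CPU count."""
    sockets: int
    cores_per_socket: int
    threads_per_core: int

    @property
    def cores(self) -> int:
        # Physical cores on the node.
        return self.sockets * self.cores_per_socket

    @property
    def cpus(self) -> int:
        # Logical CPUs, assuming every hardware thread counts as one CPU.
        return self.cores * self.threads_per_core


def mem_per_cpu_mb(real_memory_mb: int, topo: NodeTopology) -> int:
    """Even split of node memory over all logical CPUs (integer MB)."""
    return real_memory_mb // topo.cpus


# Reading A: 44 sockets with 4 cores each and no SMT.
reading_a = NodeTopology(sockets=44, cores_per_socket=4, threads_per_core=1)

# Reading B: 2 sockets (assumed) x 22 cores x 4 SMT threads.
reading_b = NodeTopology(sockets=2, cores_per_socket=22, threads_per_core=4)

print(reading_a.cores, reading_a.cpus)  # 176 176
print(reading_b.cores, reading_b.cpus)  # 44 176

# Assumed RealMemory of 176000 MB, purely to make the division visible.
print(mem_per_cpu_mb(176000, reading_b))  # 1000
```

Both readings give 176 CPUs, but only reading B matches CoresPerSocket=22. 
In that case "CPUs" would mean hardware threads and "cores" would mean the 
44 physical cores, so a job using all 176 CPUs could ask for at most 
RealMemory/176 per CPU.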

/Ole


