I *think* you could define virtual nodes, one per socket. Then you could have a policy to prefer the least loaded node. I suspect this would rarely be a useful approach, particularly if it is not commonly used by the community (so help will be hard to find).
Gareth
________________________________
From: Juergen Salk via slurm-users slurm-users@lists.schedmd.com
Sent: Saturday, June 8, 2024 8:36:25 AM
To: Alan Stange stange@rentec.com
Cc: slurm-users@lists.schedmd.com slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: cpu distribution question
Hi Alan,
unfortunately, process placement in Slurm is kind of black magic for sub-node jobs, i.e. jobs that allocate only a small number of CPUs of a node.
I have recently raised a similar question here:
https://support.schedmd.com/show_bug.cgi?id=19236
And the bottom line was that to "really have control over task placement you really have to allocate the node in --exclusive manner".
Best regards Jürgen
* Alan Stange via slurm-users slurm-users@lists.schedmd.com [240607 14:52]:
All,
I have a very simple slurm cluster. It's just a single system with 2 sockets and 16 cores in each socket. I would like to be able to submit a simple task into this cluster, and to have the cpus assigned to that task allocated round robin across the two sockets. Everything I try is putting all the cpus for this single task on the same socket.
I have not specified any CpuBind options in the slurm.conf file. For example, if I try
$ srun -c 4 --pty bash
I get a shell prompt on the system, and can run
$ taskset -cp $$
pid 12345 current affinity list: 0,2,4,6
and I get this same set of cpus no matter what options I try (the cluster is idle with no tasks consuming slots).
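To confirm which socket each of those CPUs sits on, rather than inferring it from the CPU numbers, one option is to read the Linux sysfs topology files. The following is a minimal sketch, not part of the original message: it assumes a Linux system that exposes `topology/physical_package_id` under `/sys/devices/system/cpu`, and the function names and the `CPU_SYSFS` variable are hypothetical helpers introduced only for illustration.

```shell
# Sketch: map each CPU in an affinity list to its socket (physical package).
# CPU_SYSFS is a hypothetical override, present only so the lookup can be
# pointed at a different directory; by default it is the Linux sysfs path.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Expand a list such as "0,2,4-6" into one CPU number per line.
expand_cpu_list() {
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case "$part" in
            *-*) seq "${part%-*}" "${part#*-}" ;;   # range, e.g. 4-6
            *)   printf '%s\n' "$part" ;;           # single CPU
        esac
    done
}

# Print "cpuN socketM" for every CPU in the list passed as $1.
sockets_for_cpus() {
    for cpu in $(expand_cpu_list "$1"); do
        printf 'cpu%s socket%s\n' "$cpu" \
            "$(cat "$CPU_SYSFS/cpu$cpu/topology/physical_package_id")"
    done
}
```

Inside the shell started by srun, it could be called on the list reported by taskset, for example `sockets_for_cpus 0,2,4,6`, or on the kernel's own view of the current affinity with `sockets_for_cpus "$(awk '/^Cpus_allowed_list/ {print $2}' /proc/self/status)"`.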
I've tried various srun command line options like:
    --hint=compute_bound
    --hint=memory_bound
    various --cpu-bind options
    -B 2:2
    -m block:cyclic and block:fcyclic
Note that if I try to allocate 17 cpus, then I do get the 17th cpu allocated on the 2nd socket.
What magic incantation is needed to get an allocation where the cpus are chosen round robin across the sockets?
Thank you!
Alan
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-leave@lists.schedmd.com
--
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471