[slurm-users] MaxCPUsPerNode Clarification

Wed Jun 22 08:50:10 UTC 2022

Hello,

The solution we currently use at our site is indeed a separate
partition. Following your example, it would look like this:

Partition  Nodes        #CPUs Available
cpu        cpu-[01-03]  64
cpu_any    gpu-[01-02]  32 (set with MaxCPUsPerNode=32)
gpu        gpu-[01-02]  64

The trick is to have CPU-only jobs with <cores_per_node> <= 32 set
"--partition=cpu,cpu_any" to signal to the scheduler that they can run
in either partition.
Together with node weights, you can then make sure that CPU-only jobs
fill up the cpu-<xy> nodes first before taking cores from the gpu-<xy>
nodes via the cpu_any partition.
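
For illustration, the layout above could be written in slurm.conf roughly
as follows. This is only a sketch: the CPUs= and Weight= values are
assumptions chosen for the example, GRES definitions for the GPU nodes are
omitted, and all other node and partition options are left out.

```
# Sketch only: CPU counts and weights are illustrative assumptions.
# Among eligible nodes, Slurm prefers those with the lowest Weight.
NodeName=cpu-[01-03] CPUs=64 Weight=10
NodeName=gpu-[01-02] CPUs=64 Weight=100

# CPU-only jobs may use all 64 cores on the CPU nodes ...
PartitionName=cpu     Nodes=cpu-[01-03]
# ... but at most 32 cores per node on the GPU nodes.
PartitionName=cpu_any Nodes=gpu-[01-02] MaxCPUsPerNode=32
# GPU jobs keep access to all cores on the GPU nodes.
PartitionName=gpu     Nodes=gpu-[01-02]
```

A CPU-only job that fits within 32 cores per node would then be submitted
with something like "sbatch --partition=cpu,cpu_any ...".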

This also opens up the possibility of automatically changing
--partition=cpu to --partition=cpu,cpu_any if <cores_per_node> <= 32 via
job_submit.lua. A good starting template can be found here, for example:
https://gist.github.com/mikerenfro/92d70562f9bb3f721ad1b221a1356de5
I'd be careful and test this first, though, as I cannot say whether it
still works unmodified with current Slurm versions.
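
To make the rewrite rule concrete, here is a small Python sketch of the
decision only. It is not Slurm code: a real job_submit.lua has to be
written in Lua against Slurm's job descriptor, and the function name
widen_partition and its limit parameter are hypothetical names chosen for
this example.

```python
def widen_partition(partition, cores_per_node, limit=32):
    """Return the partition string a submit filter might set.

    Hypothetical helper: only a job that asked for exactly the "cpu"
    partition and needs at most `limit` cores per node is widened.
    """
    requested = [name for name in partition.split(",") if name]
    if requested == ["cpu"] and cores_per_node <= limit:
        return "cpu,cpu_any"
    # Anything else (other partitions, larger jobs) is left untouched.
    return partition


if __name__ == "__main__":
    for part, cores in [("cpu", 16), ("cpu", 64), ("gpu", 8)]:
        print(part, cores, "->", widen_partition(part, cores))
```

Jobs that need more than 32 cores per node stay in the cpu partition only,
so they never land on the GPU nodes.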

Regards,
René Sitt

On 21.06.22 at 16:11, Willy Markuske wrote:
>
> Hello All,
>
> I'm trying to clarify how the MaxCPUsPerNode can be configured. I'm 
> looking to enable my "cpu" partition to run on our GPU nodes while 
> ensuring there are always some cpus available for the "gpu" partition. 
> I know I can set the "cpu" partition to have a MaxCPUsPerNode less 
> than the number of available cpus on the GPU nodes to do this. 
> However, I also don't want to limit the number of CPUs available on a
> CPU node, which doesn't seem possible currently because only a single
> partition definition can be included in slurm.conf.
>
> The desired configuration would be something like this
>
> Partition  Nodes        #CPUs Available
> cpu        cpu-[01-03]  64
> cpu        gpu-[01-02]  32
> gpu        gpu-[01-02]  64
>
> It doesn't seem possible to set a partition to limit MaxCPUsPerNode on 
> a per node basis. Is the real solution a different partition/QOS to 
> handle this?
>
> Regards,
>
> -- 
>
> Willy Markuske
>
> HPC Systems Engineer
>
> Research Data Services
>
> P: (619) 519-4435
>
-- 
Dipl.-Chem. René Sitt
Hessisches Kompetenzzentrum für Hochleistungsrechnen
Philipps-Universität Marburg
Hans-Meerwein-Straße
35032 Marburg

Tel. +49 6421 28 23523
sittr at hrz.uni-marburg.de
www.hkhlr.de