[slurm-users] Distribute a single node resources across multiple partitions
Sam Gallop (NBI)
sam.gallop at nbi.ac.uk
Fri Jul 7 11:34:47 UTC 2023
Something might be possible, but it's a bit of a kludge. To do this, cgroups and ConstrainCores need to be configured.
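As a rough sketch of what that configuration could look like: only ConstrainCores comes from this message, and the plugin selection shown for slurm.conf is an assumption based on the stock cgroup plugins, so check it against your own setup.

```conf
# cgroup.conf
ConstrainCores=yes

# slurm.conf (assumed plugin selection, not stated in this thread)
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup
```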
Say you have a node called tux that has 16 cores and 512GB, and you want to split it into two logical nodes of 8 cores and 256GB.
In slurm.conf, add the NodeNames as you want them (in this case tux01 and tux02), but point NodeAddr at the hostname or IP of the actual host. Divide up the resources as you wish. Note that CPUSpecList is normally used to reserve cores for system use, but here we use it on each logical node to mask off the cores that are to be used through the other logical node. Also note that the documentation says the use of the Port option "is not generally recommended except for development or testing purposes".
NodeName=tux01 NodeAddr=tux Port=6001 CPUs=16 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=262144 CPUSpecList=0-7
NodeName=tux02 NodeAddr=tux Port=6002 CPUs=16 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=262144 CPUSpecList=8-15
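For the four-node case in the original question, the same pair of lines could be generated per host. This is only a sketch: the hostnames node1..node4, the "a"/"b" suffixes and the gen_node_lines helper are hypothetical, and the hardware layout and RealMemory are simply copied from the tux example, so they would need adjusting to the real nodes.

```shell
#!/bin/sh
# Sketch: emit two logical-node lines per physical host.
# Hostnames and the "a"/"b" suffixes are hypothetical; the hardware
# layout is copied from the tux example and may not match real nodes.
gen_node_lines() {
    layout="CPUs=16 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=262144"
    for host in "$@"; do
        # CPUSpecList names the cores withheld from jobs on that logical node.
        printf 'NodeName=%sa NodeAddr=%s Port=6001 %s CPUSpecList=0-7\n' "$host" "$host" "$layout"
        printf 'NodeName=%sb NodeAddr=%s Port=6002 %s CPUSpecList=8-15\n' "$host" "$host" "$layout"
    done
}

gen_node_lines node1 node2 node3 node4
```

The output is meant to be pasted into slurm.conf by hand; the "a" nodes would then go into one partition and the "b" nodes into the other.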
Then add the nodes to the partitions.
PartitionName=ppart Nodes=tux01 ...
PartitionName=cpart Nodes=tux02 ...
You'll then need to run two slurmd services on each physical host, one per logical node, and use the '-N' option to run each daemon with the given node name, for example 'slurmd -N tux01'.
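A minimal sketch of starting both daemons on tux follows. The start_logical_nodes helper and the DRY_RUN switch are hypothetical conveniences, not part of Slurm; only 'slurmd -N <name>' comes from the message above. In practice you would more likely wrap each daemon in its own service unit.

```shell
#!/bin/sh
# Sketch: start one slurmd per logical node on the physical host.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

start_logical_nodes() {
    for node in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "slurmd -N $node"
        else
            # slurmd backgrounds itself by default, so the loop continues.
            slurmd -N "$node"
        fi
    done
}

start_logical_nodes tux01 tux02
```

Once both daemons are running, 'scontrol show node tux01' and 'scontrol show node tux02' should each report their own node.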
Like I say, it's a bit of a kludge.
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Purvesh Parmar
Sent: Thursday, July 6, 2023 1:21 PM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Distribute a single node resources across multiple partitions
Do I need to run a separate slurmctld and slurmd for this? I am struggling with this. Any pointers?
On Mon, 26 Jun 2023 at 12:15, Purvesh Parmar <purveshp0507 at gmail.com> wrote:
I have Slurm 20.11 in a cluster of 4 nodes, with each node having 16 CPUs. I want to create two partitions (ppart and cpart) such that 8 cores from each of the 4 nodes are part of ppart and the remaining 8 cores are part of cpart. In other words, I want to distribute each node's resources across multiple partitions exclusively. How do I go about this?