[slurm-users] Defining an empty partition
Tina Friedrich
tina.friedrich at it.ox.ac.uk
Fri Dec 18 12:45:26 UTC 2020
Yeah, I had that problem as well (trying to set up a partition that
didn't have any nodes - they're not here yet).
I figured that one can have partitions with nodes that don't exist,
though. As in, not even in DNS.
I currently have this:
[arc-slurm ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
short        up   12:00:00      1  down* arc-c023
short        up   12:00:00      1  alloc arc-c001
short        up   12:00:00     43   idle arc-c[002-022,024-045]
medium       up 2-00:00:00      0    n/a
long*        up   infinite      0    n/a
with the medium & long partitions containing nodes 'arc-c[046-297]':
PartitionName=medium
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=N/A
   DefaultTime=12:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=2-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=arc-c[046-297]...
which don't exist as of today:
[arc-slurm ~]$ host arc-c046
Host arc-c046 not found: 3(NXDOMAIN)
which - as you can see - simply ends up with SLURM showing the partition
with no nodes.
So you could just put a dummy nodename in the slurm.conf file?
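A minimal sketch of what that might look like (untested; the node name 'placeholder-c001' and CPUs=1 are made up for illustration, and State=FUTURE is an assumption on my part rather than something taken from the config above):

```
# Hypothetical placeholder node that does not exist yet (name and CPUs are invented).
# State=FUTURE marks a node as defined for future use; whether it is needed here is an assumption.
NodeName=placeholder-c001 CPUs=1 State=FUTURE

# Partition line from the original question, with the placeholder instead of an empty Nodes=
PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=placeholder-c001
```

The real NodeName definitions could then replace the placeholder once the compute nodes arrive.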
Tina
On 18/12/2020 11:13, Steve Brasier wrote:
> Having tried just not even defining any partitions you hit this
> <https://github.com/SchedMD/slurm/blob/master/src/common/node_conf.c#L383> check
> which seems to ensure you can't create a cluster with no nodes. Is it
> possible to create a control node without any compute nodes, e.g. as
> part of a staged deployment?
>
> http://stackhpc.com/ <http://stackhpc.com/>
> Please note I work Tuesday to Friday.
>
>
> On Fri, 18 Dec 2020 at 10:56, Steve Brasier <steveb at stackhpc.com
> <mailto:steveb at stackhpc.com>> wrote:
>
> Hi all,
>
> According to the relevant manpage
> <https://slurm.schedmd.com/archive/slurm-20.02.5/slurm.conf.html>
> it's possible to define an empty partition using "Nodes= ".
>
> However this doesn't seem to work (Slurm 20.02.5):
>
> [centos at testohpc-login-0 ~]$ grep -n Partition /etc/slurm/slurm.conf
> 72:PriorityWeightPartition=1000
> 105:PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=
>
> (note there is a space after that final "=" but I've tried both
> with and without)
>
> [centos at testohpc-login-0 ~]$ sinfo
> sinfo: error: Parse error in file /etc/slurm/slurm.conf line 105:
> " Nodes= "
> sinfo: fatal: Unable to process configuration file
>
> Is this a bug, or am I doing it wrong?
>
> thanks for any suggestions
>
> Steve
>
> http://stackhpc.com/ <http://stackhpc.com/>
> Please note I work Tuesday to Friday.
>