[slurm-users] Topology configuration questions:
Prentice Bisbal
pbisbal at pppl.gov
Thu Jan 17 21:49:08 UTC 2019
From https://slurm.schedmd.com/topology.html:
> Note that compute nodes on switches that lack a common parent switch
> can be used, but no job will span leaf switches without a common
> parent (unless the TopologyParam=TopoOptional option is used). For
> example, it is legal to remove the line "SwitchName=s4
> Switches=s[0-3]" from the above topology.conf file. In that case, no
> job will span more than four compute nodes on any single leaf switch.
> This configuration can be useful if one wants to schedule multiple
> physical clusters as a single logical cluster under the control of a
> single slurmctld daemon.
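For reference, the topology.conf example that the quoted passage refers to looks roughly like the sketch below (node names are illustrative, not from the actual docs): four leaf switches of four nodes each, joined by the top-level switch s4 that the passage says may be removed.

```
# Sketch of the topology.conf example referenced above.
# Leaf switches, four compute nodes each:
SwitchName=s0 Nodes=tux[0-3]
SwitchName=s1 Nodes=tux[4-7]
SwitchName=s2 Nodes=tux[8-11]
SwitchName=s3 Nodes=tux[12-15]
# Top-level switch joining the leaves; removing this line means
# no job will span more than one leaf (i.e., more than four nodes):
SwitchName=s4 Switches=s[0-3]
```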
My current environment falls into the category of multiple physical
clusters being treated as a single logical cluster under the control of
a single slurmctld daemon. At least, that's my goal.
In my environment, I have two "clusters" connected by their own separate
IB fabrics, and one "cluster" connected with 10 GbE. I have a fourth
cluster connected with only 1 GbE. For this fourth cluster, we don't want
jobs to span nodes, due to the slow performance of 1 GbE. (This cluster
is intended for serial and low core-count parallel jobs.) If I just leave
those nodes out of the topology.conf file, will that have the desired
effect of not allocating multi-node jobs to those nodes, or will it
result in an error of some sort?
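To make the question concrete, the layout I have in mind is something like the following sketch (switch and node names are hypothetical, and the 1 GbE nodes are simply absent):

```
# Hypothetical topology.conf for the environment described above.
# No top-level switch joins these, so jobs stay within one fabric.
SwitchName=ib-fabric-a  Nodes=clusterA[001-064]   # first IB cluster
SwitchName=ib-fabric-b  Nodes=clusterB[001-064]   # second IB cluster
SwitchName=eth-10g      Nodes=clusterC[001-032]   # 10 GbE cluster
# clusterD[001-016] (1 GbE) intentionally omitted -- is that safe,
# or must every node appear in topology.conf?
```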
--
Prentice