[slurm-users] Topology configuration questions:

Prentice Bisbal pbisbal at pppl.gov
Thu Jan 17 21:52:46 UTC 2019


And a follow-up question: does topology.conf need to be on all the 
nodes, or just on the Slurm controller? That web page doesn't make it 
clear. I would assume only the controller needs it.

Prentice

On 1/17/19 4:49 PM, Prentice Bisbal wrote:
> From https://slurm.schedmd.com/topology.html:
>
>> Note that compute nodes on switches that lack a common parent switch 
>> can be used, but no job will span leaf switches without a common 
>> parent (unless the TopologyParam=TopoOptional option is used). For 
>> example, it is legal to remove the line "SwitchName=s4 
>> Switches=s[0-3]" from the above topology.conf file. In that case, no 
>> job will span more than four compute nodes on any single leaf switch. 
>> This configuration can be useful if one wants to schedule multiple 
>> physical clusters as a single logical cluster under the control of a 
>> single slurmctld daemon.
>
> My current environment falls into the category of multiple physical 
> clusters being treated as a single logical cluster under the control 
> of a single slurmctld daemon. At least, that's my goal.
>
> In my environment, I have 2 "clusters" connected by their own separate 
> IB fabrics, and one "cluster" connected with 10 GbE. I have a fourth 
> cluster connected with only 1 GbE. For this 4th cluster, we don't want 
> jobs to span nodes, due to the slow performance of 1 GbE. (This 
> cluster is intended for serial and low-core-count parallel jobs.) If I 
> just leave those nodes out of the topology.conf file, will that have 
> the desired effect of not allocating multi-node jobs to those nodes, 
> or will it result in an error of some sort?
>
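
For reference, here is a rough sketch of the kind of topology.conf being 
described, not a tested configuration. All switch and node names (ib1, 
ib2, eth10, gbe*, and the node ranges) are hypothetical and do not come 
from the real environment. Each fabric is reduced to a single leaf switch 
for brevity. The sketch leans on the rule quoted above: leaf switches 
without a common parent are legal, and no job spans them unless 
TopologyParam=TopoOptional is set. Giving each 1 GbE node its own leaf 
switch is an assumption about how to apply that rule, as an alternative 
to leaving those nodes out entirely. Whether omitted nodes cause an error 
is still the open question.

```
# topology.conf (sketch only; every switch and node name is hypothetical)

# Two InfiniBand fabrics, each its own leaf switch, with no common parent
SwitchName=ib1 Nodes=iba[001-032]
SwitchName=ib2 Nodes=ibb[001-032]

# 10 GbE cluster
SwitchName=eth10 Nodes=xge[001-016]

# 1 GbE cluster: one leaf switch per node, so no two nodes share a switch
SwitchName=gbe1 Nodes=gbe001
SwitchName=gbe2 Nodes=gbe002
```

Note that no "Switches=" line ties these leaf switches together. That 
mirrors removing the "SwitchName=s4 Switches=s[0-3]" line in the 
documentation's example.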


