[slurm-users] Switch setting in slurm.conf breaks slurmctld if the switch type is not there in slurmctld node

Richard Chang rchang.lists at gmail.com
Fri Oct 28 06:30:05 UTC 2022


Yes, the system is an HPE Cray EX, and I am trying to use
switch/hpe_slingshot.
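
One way to confirm whether the switch plugin is actually installed on the
slurmctld node is to look for its shared object in the Slurm plugin
directory. The sketch below is only an illustration: the helper name
has_switch_plugin is hypothetical, and it assumes that plugins are installed
as <type>_<name>.so files (so switch/hpe_slingshot would map to
switch_hpe_slingshot.so) in the directory given by PluginDir in slurm.conf.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeeds if the plugin file for
# the given SwitchType value exists in the given plugin directory.
# Assumption: plugin files are named "<type>_<name>.so".
has_switch_plugin() {
    plugin_dir=$1     # e.g. the PluginDir value from slurm.conf
    switch_type=$2    # e.g. switch/hpe_slingshot

    # Turn "switch/hpe_slingshot" into "switch_hpe_slingshot.so".
    plugin_file="$(printf '%s' "$switch_type" | tr '/' '_').so"

    [ -f "$plugin_dir/$plugin_file" ]
}

# Example usage (the directory below is an assumption; use your PluginDir):
# has_switch_plugin /usr/lib64/slurm switch/hpe_slingshot \
#     && echo "plugin present" || echo "plugin missing"
```

Running the same check on a compute node and on the slurmctld node would
show whether the two installations differ.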

RC


On 10/28/2022 11:21 AM, Ole Holm Nielsen wrote:
> On 10/28/22 07:35, Richard Chang wrote:
>> I have observed that when I specify a switch type in the slurm.conf
>> file and that particular switch type is not present on the slurmctld
>> node, slurmctld panics and shuts down. Is this expected? My
>> slurmctld node doesn't have the switch type, but the compute nodes
>> do. How can I set it up so that the compute nodes can use the feature
>> without breaking Slurm?
>
> What is your line in slurm.conf?  The manual page seems to describe
> what you have observed:
>
> SwitchType
>               Identifies the type of switch or interconnect used for
>               application communications. Acceptable values include
>               "switch/cray_aries" for Cray systems, "switch/none" for
>               switches not requiring special processing for job launch
>               or termination (Ethernet, and InfiniBand) and The default
>               value is "switch/none". All Slurm daemons, commands and
>               running jobs must be restarted for a change in SwitchType
>               to take effect. If running jobs exist at the time
>               slurmctld is restarted with a new value of SwitchType,
>               records of all jobs in any state may be lost.
>
> Why do you want to use this configuration?  Is your system a Cray?
>
> /Ole
>


