[slurm-users] enable_configless, srun and DNS vs. hosts file
Mark Dixon
mark.c.dixon at durham.ac.uk
Wed Nov 10 15:13:30 UTC 2021

Hi,
I'm using the "enable_configless" mode to avoid the need for a shared 
slurm.conf file, and am having similar trouble to others when running 
"srun", e.g.
   srun: error: fwd_tree_thread: can't find address for host cn120, check slurm.conf
   srun: error: Task launch for StepId=113.0 failed on node cn120: Can't find an address, check slurm.conf
   srun: error: Application launch failed: Can't find an address, check slurm.conf
   srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
I understand that the accepted solution is to add the nodenames to DNS. Is 
that really correct?
I ask because it would be a great help if Slurm instead used the more 
usual mechanism and consulted the sources listed in /etc/nsswitch.conf. 
We use a large /etc/hosts file instead of DNS for our cluster and would 
rather not start running named if we can help it.
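For reference, here is a minimal sketch of the kind of check I mean by
"the usual mechanism". It assumes Python 3 is available on the submit
host; the function name unresolved_nodes is just something I made up for
illustration. On glibc systems socket.getaddrinfo goes through the
sources in /etc/nsswitch.conf, so names that only exist in /etc/hosts
should resolve here even with no DNS server running.

```python
import socket


def unresolved_nodes(names, resolve=socket.getaddrinfo):
    """Return the names that the system resolver cannot map to an address.

    `resolve` defaults to socket.getaddrinfo; it is a parameter only so
    the lookup can be swapped out when experimenting.
    """
    missing = []
    for name in names:
        try:
            # Port is None because only the name lookup matters here.
            resolve(name, None)
        except socket.gaierror:
            # Raised when no configured source knows the name.
            missing.append(name)
    return missing
```

Calling unresolved_nodes(["cn120"]) on the submit host returns an empty
list if the ordinary resolver path can find the node, and ["cn120"] if
it cannot. That says nothing about what srun itself does internally; it
only shows what the standard lookup path returns on that host.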
Thanks,
Mark
PS Adding a line like "NodeName=cn[001-999]" to the slurm.conf file on
    the submit/compute hosts makes the error go away (I hope that
    skipping the node detail, or listing nodes that don't exist [yet],
    won't cause other problems).
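
To make clear what a range like cn[001-999] covers, here is a rough
Python sketch that expands a single bracketed range into node names. It
is a simplification for illustration only: expand_hostlist is a
hypothetical helper, and it handles just one "[lo-hi]" range, not the
full Slurm hostlist syntax (comma lists, multiple ranges). On a machine
with Slurm installed, "scontrol show hostnames" does the real
expansion.

```python
import re


def expand_hostlist(expr):
    """Expand a single-range expression such as 'cn[001-999]' into names.

    Simplified: supports exactly one '[lo-hi]' range; anything else is
    returned unchanged as a one-element list.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", expr)
    if match is None:
        return [expr]  # no bracket range, treat as a literal name
    prefix, lo, hi, suffix = match.groups()
    width = len(lo)  # keep the zero padding of the lower bound
    return [
        f"{prefix}{n:0{width}d}{suffix}"
        for n in range(int(lo), int(hi) + 1)
    ]
```

The resulting list can be compared against the entries in /etc/hosts to
see which of the declared names do not exist yet.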