[slurm-users] problem with "configless" slurm.conf

Durai Arasan arasan.durai at gmail.com
Tue Jul 20 14:01:31 UTC 2021


Hello,

We have set up "configless slurm" by passing a "conf-server" argument to
slurmd on all nodes. More details here:
https://slurm.schedmd.com/configless_slurm.html

One of the nodes is not able to pick up the configuration:


    > srun -w slurm-bm-70 --pty bash
    srun: error: fwd_tree_thread: can't find address for host slurm-bm-70, check slurm.conf
    srun: error: Task launch for 402011.0 failed on node slurm-bm-70: Can't find an address, check slurm.conf
    srun: error: Application launch failed: Can't find an address, check slurm.conf
    srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
    srun: error: Timed out waiting for job step to complete

Only this one node is affected. Do you know how to fix this? I have
already tried restarting the slurmd service on this node.
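
One check that might help narrow this down is whether the host running
srun can resolve the node name at all. The sketch below assumes the
"can't find address" message reflects a failed name lookup on the submit
host, which is not confirmed here; Slurm may also take the address from a
NodeAddr entry in slurm.conf, which this check does not cover. The
can_resolve helper is a hypothetical name used only for illustration.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver finds at least one address."""
    try:
        # Uses the system resolver (/etc/hosts, DNS, etc.) on this host only.
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # slurm-bm-70 is the node name taken from the error output above.
    for node in ("slurm-bm-70",):
        status = "resolves" if can_resolve(node) else "does NOT resolve"
        print(f"{node}: {status}")
```

Running this on the submit host and on a node that works would show
whether the lookup fails in one place only.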

Thanks,
Durai