[slurm-users] 20.11.8: Altered federation code ? "siblings not synced yet" messages

Kevin Buckley Kevin.Buckley at pawsey.org.au
Mon Jul 5 03:39:14 UTC 2021


Upgraded our Cray TDS from 20.11.7 to 20.11.8, without making any
changes to the configuration, but am now not seeing jobs start to
run, whilst seeing messages in the slurmd log akin to these four:

  Submitted federated JobId=67122494 to tdsname(self)
  _slurm_rpc_submit_batch_job: JobId=67122494 InitPrio=0 usec=8208
  sched: schedule() returning, federation siblings not synced yet
  sched/backfill: _attempt_backfill: returning, federation siblings not synced yet


none of which were in evidence prior to the upgrade.
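
For anyone wanting to check their own logs for the same symptom, a
minimal Python sketch along these lines could tally the messages per
scheduler component. The function name and the sample lines are
illustrative only, and the regular expression assumes the message
format quoted above; the location of the log file is site-specific
and not shown here.

```python
import re

# Matches the two scheduler messages quoted above and captures the
# emitting component ("sched" or "sched/backfill").
NOT_SYNCED = re.compile(
    r"(sched(?:/\w+)?): .*federation siblings not synced yet"
)


def count_not_synced(lines):
    """Return a dict mapping component name to number of matching lines."""
    counts = {}
    for line in lines:
        match = NOT_SYNCED.search(line)
        if match:
            component = match.group(1)
            counts[component] = counts.get(component, 0) + 1
    return counts


# Sample input: the four lines from the post above.
sample = [
    "Submitted federated JobId=67122494 to tdsname(self)",
    "_slurm_rpc_submit_batch_job: JobId=67122494 InitPrio=0 usec=8208",
    "sched: schedule() returning, federation siblings not synced yet",
    "sched/backfill: _attempt_backfill: returning, federation siblings not synced yet",
]

if __name__ == "__main__":
    for component, n in sorted(count_not_synced(sample).items()):
        print(f"{component}: {n}")
```

To use it on a real log, pass an open file object in place of the
sample list, since the function only iterates over lines.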

Didn't see anything in the 20.11.8 changes that suggested anything
to do with "federation" had been introduced, though I have yet to
trawl through the code.

Anyone seen similar?

Kevin
-- 
Supercomputing Systems Administrator
Pawsey Supercomputing Centre


