On Sep 26, 2024, at 15:03, Ward Poelmans via slurm-users <slurm-users@lists.schedmd.com> wrote:

> Hi Bjørn-Helge,
>
> On 26/09/2024 09:50, Bjørn-Helge Mevik via slurm-users wrote:
>> Ward Poelmans via slurm-users <slurm-users@lists.schedmd.com> writes:
>>> We hit a snag when updating our clusters from Slurm 23.02 to
>>> 24.05. After updating the slurmdbd, our multi cluster setup was broken
>>> until everything was updated to 24.05. We had not anticipated this.
>>
>> When you say "everything", do you mean all the slurmctlds, or also all slurmds?
>
> Yes, the issue was gone after *everything* was upgraded: the slurmctld, slurmd and login nodes.

Ward, apologies for reopening your ticket and marking it sev 1 (which apparently is possible!), but their response to this is unsatisfactory. I can understand not wanting to change the code if they made such a large mistake and it's hard to reverse. However, fixing the upgrade guide is something that should have been done within the hour, not still be pending two days later. They could be causing production outages right now at sites that are following their directions, which promise compatibility.

Thank you for saving those of us who read this list from that major headache!

--
#BlackLivesMatter
____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novosirj@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
     `'