[slurm-users] Moving Slurmctld and slurmdbd to a new host

Michael Gutteridge michael.gutteridge at gmail.com
Sat Jan 16 18:43:39 UTC 2021


I'd confirm that as well.  The state directory (StateSaveLocation in
slurm.conf) has all of that information.  We just upgraded from 18.05 to
20.02 on a different host, and while the cluster was quiet (we had a
maintenance reservation in place), there were running jobs, and they
survived the upgrade.

I think the big thing to watch out for is setting SlurmdTimeout in your
slurm.conf high enough to cover the outage before you start.  It might not
be necessary depending on the exact steps you're using, but it's useful
insurance against job loss.
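
As a rough illustration of those two precautions, here is a minimal Python
sketch that reads StateSaveLocation and SlurmdTimeout out of a slurm.conf
and archives the state directory before a move.  The function names, the
paths in the commented usage, and the simple parser are assumptions made
for illustration (the parser ignores Include directives), not part of
Slurm itself.  Take the archive with slurmctld stopped so the state files
are not changing underneath it.

```python
import tarfile
from pathlib import Path


def read_slurm_conf(path):
    """Parse simple Key=Value lines from a slurm.conf-style file.

    Keys are lowercased so lookups do not depend on how the parameter
    name was capitalized. Lines carrying several Key=Value pairs
    (NodeName, PartitionName) are stored under their first key only,
    which is enough for the two settings of interest here.
    """
    settings = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def backup_state_dir(state_dir, archive_path):
    """Write a gzip-compressed tar archive of the state directory."""
    state_dir = Path(state_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store entries relative to the directory name, not the full path.
        tar.add(state_dir, arcname=state_dir.name)
    return Path(archive_path)


# Hypothetical usage (both paths are assumptions, adjust for your site):
#   conf = read_slurm_conf("/etc/slurm/slurm.conf")
#   print("SlurmdTimeout:", conf.get("slurmdtimeout", "not set"))
#   backup_state_dir(conf["statesavelocation"], "/root/slurm-state.tar.gz")
```

Restoring is then a matter of unpacking the archive on the new host at the
path that its StateSaveLocation points to.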

HTH

 - Michael


On Fri, Jan 15, 2021 at 7:51 PM Ryan Novosielski <novosirj at rutgers.edu>
wrote:

> My understanding is that it's the job state directory. Theoretically, if
> you back it up, then screw up and lose it, you can restore it and try
> again. There’s some mention of this in the upgrade docs, if I’m not
> mistaken (they suggest backing it up in case you mess up during the
> upgrade).
>
> --
> #BlackLivesMatter
> ____
> || \\UTGERS,
> |---------------------------*O*---------------------------
> ||_// the State     |         Ryan Novosielski - novosirj at rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\    of NJ     | Office of Advanced Research Computing - MSB C630,
> Newark
>     `'
>
> On Jan 15, 2021, at 13:44, Prentice Bisbal <pbisbal at pppl.gov> wrote:
>
> Slurm users,
>
> I'm planning on moving slurmctld and slurmdbd to a new host. I know how to
> dump the MySQL DB from the old server and import it to the new slurmdbd
> host, and I know how to copy the job state directories to the new host. I
> plan on doing this during our next maintenance window when there are no
> jobs running on the cluster.
>
> However, there will be plenty of jobs in the queue, so my question is
> this: What will happen to jobs in the queue when I do this? Is the queue
> information stored in the database or the job state directories, or a third
> location? How can I make sure I don't lose the state of the queue?
>
> --
> Prentice
>
>
>