[slurm-users] Upgrade from 17.02.11 to 21.08.2 and state information
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Fri Feb 4 14:27:31 UTC 2022
On 04-02-2022 08:59, Bjørn-Helge Mevik wrote:
> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
>
>> As Brian Andrus said, you must upgrade Slurm by at most 2 major
>> versions, and that includes slurmd's as well! Don't do a "direct
>> upgrade" of slurmd by more than 2 versions!
>
> That should only be an issue if you have running jobs during the
> upgrade, shouldn't it? As I understand it, without any running jobs,
> you can do pretty much what you want on the compute nodes. Or am I
> missing something here?
I think that Slurm's communication protocol is incompatible when the
versions differ by more than 2 major releases, regardless of whether
jobs are running. So the slurmd daemons may lose contact with the
slurmctld in that case.
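
To make the "at most 2 major versions" rule concrete, here is a minimal sketch that counts how many major releases lie between two Slurm versions and proposes intermediate stops. The release list is an assumption written down from memory (check it against the official release archive), and the helper names `release_gap` and `upgrade_path` are purely illustrative, not part of any Slurm tool.

```python
# Assumed list of Slurm major releases, oldest first (verify before relying on it).
SLURM_RELEASES = ["16.05", "17.02", "17.11", "18.08",
                  "19.05", "20.02", "20.11", "21.08"]


def major(version):
    """Return the major release part, e.g. '17.02.11' -> '17.02'."""
    return ".".join(version.split(".")[:2])


def release_gap(old, new):
    """Number of major releases between two versions (hypothetical helper)."""
    return SLURM_RELEASES.index(major(new)) - SLURM_RELEASES.index(major(old))


def upgrade_path(old, new, max_step=2):
    """List the major releases to stop at so that no hop exceeds max_step."""
    i = SLURM_RELEASES.index(major(old))
    j = SLURM_RELEASES.index(major(new))
    path = []
    while i < j:
        i = min(i + max_step, j)
        path.append(SLURM_RELEASES[i])
    return path


if __name__ == "__main__":
    print(release_gap("17.02.11", "21.08.2"))   # 6
    print(upgrade_path("17.02.11", "21.08.2"))  # ['18.08', '20.02', '21.08']
```

Under these assumptions, 17.02.11 and 21.08.2 are six major releases apart, so the upgrade would need at least two intermediate stops. The same limit applies to slurmd on the compute nodes.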
In my experience, it's not a problem to upgrade slurmd while the nodes
are running jobs: upgrade the slurmd RPM, and slurmd will restart itself
and attach to the running jobs. There are probably cases where this
will cause job crashes, so please heed the information collected in the
Wiki page
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-on-centos-7
There may be some issues with MPI applications, as mentioned in the Wiki.
/Ole