[slurm-users] Upgrade from 17.02.11 to 21.08.2 and state information

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Fri Feb 4 14:27:31 UTC 2022


On 04-02-2022 08:59, Bjørn-Helge Mevik wrote:
> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
> 
>> As Brian Andrus said, you must upgrade Slurm by at most 2 major
>> versions, and that includes slurmd's as well!  Don't do a "direct
>> upgrade" of slurmd by more than 2 versions!
> 
> That should only be an issue if you have running jobs during the
> upgrade, shouldn't it?  As I understand it, without any running jobs,
> you can do pretty much what you want on the compute nodes.  Or am I
> missing something here?

I think that Slurm's communication protocol is incompatible between 
versions that differ by more than 2 major releases.  So in that case 
the slurmd daemons may lose contact with the slurmctld.
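
To illustrate the "at most 2 major versions" rule for the upgrade in 
the subject line, here is a minimal Python sketch that plans the 
intermediate hops.  The RELEASES list is my assumption of the major 
release sequence between 17.02 and 21.08, and the helper names (major, 
release_gap, upgrade_path) are made up for this example; they are not 
part of Slurm.  Please check the list against the release notes before 
relying on it.

```python
# Ordered major Slurm releases (assumption: verify against the release notes).
RELEASES = ["17.02", "17.11", "18.08", "19.05", "20.02", "20.11", "21.08"]


def major(version):
    """Return the major release part, e.g. '17.02.11' -> '17.02'."""
    return ".".join(version.split(".")[:2])


def release_gap(a, b):
    """Number of major releases separating versions a and b."""
    return abs(RELEASES.index(major(a)) - RELEASES.index(major(b)))


def upgrade_path(current, target, max_jump=2):
    """List the major releases to step through, at most max_jump per hop."""
    i = RELEASES.index(major(current))
    j = RELEASES.index(major(target))
    if j < i:
        raise ValueError("downgrades are not handled in this sketch")
    path = [RELEASES[i]]
    while i < j:
        # Jump as far as allowed, but never past the target release.
        i = min(i + max_jump, j)
        path.append(RELEASES[i])
    return path


print(upgrade_path("17.02.11", "21.08.2"))
# -> ['17.02', '18.08', '20.02', '21.08']
```

Under these assumptions, 17.02 and 21.08 are 6 major releases apart, so 
the upgrade needs at least three hops, and each hop has to cover the 
slurmd daemons as well.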

In my experience, it's not a problem to upgrade slurmd while the nodes 
are running jobs: Upgrade the slurmd RPM, and slurmd will restart itself 
and attach to the running jobs.  There are probably cases where this 
will cause job crashes, so please heed the information collected in the 
Wiki page 
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-on-centos-7
There may be some issues with MPI applications as mentioned in the Wiki.

/Ole


