[slurm-users] Upgrading SLURM from 18 to 20.11.9

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Thu Sep 8 18:39:36 UTC 2022


Paul is right!  You may upgrade 18.08 to 20.02, but not 20.11!
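
To make the two-release rule concrete, here is a minimal Python sketch. It 
is only an illustration, not anything shipped with Slurm: the release list 
is limited to the versions named in this thread, and the helper names 
(major_release, can_upgrade_directly, upgrade_path) are made up for this 
example.

```python
# Slurm major releases in chronological order (only those named in this thread).
RELEASES = ["18.08", "19.05", "20.02", "20.11", "21.08"]
MAX_JUMP = 2  # a direct upgrade may skip ahead at most two major releases


def major_release(version):
    """Reduce e.g. '20.11.9' to its major release '20.11'."""
    return ".".join(version.split(".")[:2])


def can_upgrade_directly(current, target):
    """True if target is one or two major releases after current."""
    gap = RELEASES.index(major_release(target)) - RELEASES.index(major_release(current))
    return 0 < gap <= MAX_JUMP


def upgrade_path(current, target):
    """Hypothetical helper: jump as far as allowed until target is reached."""
    i = RELEASES.index(major_release(current))
    end = RELEASES.index(major_release(target))
    if end < i:
        raise ValueError("downgrades are not covered by this sketch")
    path = [RELEASES[i]]
    while i < end:
        i = min(i + MAX_JUMP, end)
        path.append(RELEASES[i])
    return path


if __name__ == "__main__":
    print(can_upgrade_directly("18.08", "20.02"))
    print(can_upgrade_directly("18.08", "20.11.9"))
    print(upgrade_path("18.08", "20.11.9"))
```

With this list, 18.08 to 20.11.9 comes out as two hops, with 20.02 as the 
intermediate stop.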

Some details on upgrading Slurm are in this Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
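
The "protocol_version 8448 not supported" line in the quoted slurmdbd.log 
can be decoded with a few lines of Python. This is a sketch under an 
assumption: as far as I recall from the Slurm source headers, the protocol 
version is packed as (major << 8) | minor, and the table of major numbers 
below is written from memory, so please check it against the source of 
your release. The function name is made up for this example.

```python
# Assumed mapping of the protocol-version high byte to Slurm releases
# (from memory of the Slurm headers; verify before relying on it).
PROTOCOL_MAJOR_TO_RELEASE = {
    33: "18.08",
    34: "19.05",
    35: "20.02",
    36: "20.11",
}


def decode_protocol_version(value):
    """Split a packed protocol version into (major, minor, release guess)."""
    major = value >> 8      # high byte
    minor = value & 0xFF    # low byte
    release = PROTOCOL_MAJOR_TO_RELEASE.get(major, "unknown")
    return major, minor, release


if __name__ == "__main__":
    print(decode_protocol_version(8448))
```

If the assumed table is right, 8448 belongs to an 18.08 client, which would 
fit a daemon from the old installation still trying to connect to the new 
slurmdbd.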

/Ole

On 08-09-2022 18:26, Paul Edmon wrote:
> But not any 20.  There are two 20.x versions, 20.02 and 20.11, and there 
> was a previous 19.05.  So two versions on from 18.08 would be 20.02, not 
> 20.11.
> 
> 
> -Paul Edmon-
> 
> 
> On 9/8/22 12:14 PM, Wadud Miah wrote:
>> The previous version was 18 and now I am trying to upgrade to 20, so I 
>> am well within 2 major versions.
>>
>> Regards,
>> ------------------------------------------------------------------------
>> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf 
>> of Paul Edmon <pedmon at cfa.harvard.edu>
>> *Sent:* Thursday, September 8, 2022 4:44:36 PM
>> *To:* slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
>> *Subject:* Re: [slurm-users] Upgrading SLURM from 18 to 20.11.9
>> *CAUTION:* This e-mail originated outside the University of Southampton.
>>
>> Typically Slurm only supports upgrading up to 2 major versions 
>> ahead.  If you are on 18.08 you likely can only go to 20.02. Then 
>> after you upgrade to 20.02 you can go to 20.11 or 21.08.
>>
>>
>> -Paul Edmon-
>>
>>
>> On 9/8/22 11:38 AM, Wadud Miah wrote:
>>> Hi Mick,
>>>
>>> I have checked that the compute nodes and controllers all have 
>>> the same version of SLURM (20.11.9). I am indeed trying to upgrade 
>>> SlurmDB first, and am getting these errors in slurmdbd.log:
>>>
>>> [2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
>>> [2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 
>>> not supported
>>> [2022-09-08T15:33:57.001] unpacking header
>>> [2022-09-08T15:33:57.001] error: destroy_forward: no init
>>> [2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message 
>>> receive failure
>>> [2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack 
>>> SLURM_PERSIST_INIT message
>>>
>>> Regards,
>>> Wadud.
>>>
>>> ------------------------------------------------------------------------
>>> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf 
>>> of Timony, Mick <Michael_Timony at hms.harvard.edu>
>>> *Sent:* 08 September 2022 16:24
>>> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
>>> *Subject:* Re: [slurm-users] Upgrading SLURM from 18 to 20.11.9
>>> This thread on the forums may help:
>>>
>>> https://groups.google.com/g/slurm-users/c/YB55Ru9rvD4 
>>>
>>>
>>> It looks like you have something on your network with an older 
>>> version of slurm installed. I'd check the Slurm version installed on 
>>> your compute nodes and controllers.
>>>
>>> The recommended approach to upgrading is to upgrade the SlurmDB 
>>> first, then the controllers, then the compute nodes. More info here:
>>>
>>> https://slurm.schedmd.com/quickstart_admin.html#upgrade 
>>>
>>> Regards
>>> -- 
>>> Mick Timony
>>> Senior DevOps Engineer
>>> Harvard Medical School
>>> --
>>>
>>> ------------------------------------------------------------------------
>>> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf 
>>> of Wadud Miah <W.Miah at soton.ac.uk>
>>> *Sent:* Thursday, September 8, 2022 10:47 AM
>>> *To:* slurm-users at lists.schedmd.com
>>> *Subject:* [slurm-users] Upgrading SLURM from 18 to 20.11.9
>>> Hi,
>>>
>>> I am attempting to upgrade from SLURM 18 to 20.11.9 and when I 
>>> attempt to start slurmdbd (version 20.11.9), I get the following 
>>> error messages in /var/log/slurm/slurmdbd.log:
>>>
>>> [2022-09-08T15:45:11.115] slurmdbd version 20.11.9 started
>>> [2022-09-08T15:45:23.001] error: unpack_header: protocol_version 8448 
>>> not supported
>>> [2022-09-08T15:33:57.001] unpacking header
>>> [2022-09-08T15:33:57.001] error: destroy_forward: no init
>>> [2022-09-08T15:33:57.001] error: slurm_unpack_received_msg: Message 
>>> receive failure
>>> [2022-09-08T15:33:57.011] error: CONN:11 Failed to unpack 
>>> SLURM_PERSIST_INIT message
>>>
>>> Any help will be greatly appreciated.
>>>
>>> Regards,
>>>
>>> ----------
>>> Wadud Miah
>>> Research Computing Support
>>> University of Southampton



