[slurm-users] after upgrade to 23.11.1 nodes stuck in completion state
Heckes, Frank
heckes at mps.mpg.de
Tue Jan 30 16:50:28 UTC 2024
This is scary news. I just updated to 23.11.1, but so far I couldn't confirm the
problems described. I'll do some more extensive and intensive tests.
In case of disaster: Does anyone know how to roll back the DB, as some new DB
'objects' attributes are introduced in 23.11.1? I never had the chance to do
this before :-0
As we have a support contract, I would open a ticket.
> -----Original Message-----
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of
> Ole Holm Nielsen
> Sent: Tuesday, 30 January 2024 10:04
> To: slurm-users at lists.schedmd.com
> Subject: Re: [slurm-users] after upgrade to 23.11.1 nodes stuck in
> completion state
>
> On 1/30/24 09:36, Fokke Dijkstra wrote:
> > We had similar issues with Slurm 23.11.1 (and 23.11.2). Jobs get stuck
> > in a completing state and slurmd daemons can't be killed because they
> > are left in a CLOSE-WAIT state. See my previous mail to the mailing
> > list for the details. And also
> > https://bugs.schedmd.com/show_bug.cgi?id=18561 for another site
> > having issues.
>
> Bug 18561 was submitted by a user with no support contract, so it's unlikely
> that SchedMD will look into it.
>
> I guess many sites are considering the upgrade to 23.11, and if there is an
> issue as reported, a site with a valid support contract needs to open a
> support case. I'm very interested in hearing about any progress with 23.11!
>
> Thanks,
> Ole