[slurm-users] after upgrade to 23.11.1 nodes stuck in completion state
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Jan 30 09:04:16 UTC 2024
On 1/30/24 09:36, Fokke Dijkstra wrote:
> We had similar issues with Slurm 23.11.1 (and 23.11.2). Jobs get stuck in
> a completing state and slurmd daemons can't be killed because they are
> left in a CLOSE-WAIT state. See my previous mail to the mailing list for
> the details. And also https://bugs.schedmd.com/show_bug.cgi?id=18561 for
> another site having issues.
Bug 18561 was submitted by a user with no support contract, so it's
unlikely that SchedMD will look into it.
I guess many sites are considering the upgrade to 23.11. If the issue is
as reported, a site with a valid support contract needs to open a support
case. I'm very interested in hearing about any progress with 23.11!
Thanks,
Ole