[slurm-users] after upgrade to 23.11.1 nodes stuck in completion state

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Jan 30 09:04:16 UTC 2024


On 1/30/24 09:36, Fokke Dijkstra wrote:
> We had similar issues with Slurm 23.11.1 (and 23.11.2). Jobs get stuck in 
> a completing state and slurmd daemons can't be killed because they are 
> left in a CLOSE-WAIT state. See my previous mail to the mailing list for 
> the details. And also https://bugs.schedmd.com/show_bug.cgi?id=18561 for 
> another site having issues.
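
For anyone wanting to check whether a node shows the symptom described 
above, here is a minimal sketch that counts slurmd sockets in the 
CLOSE-WAIT state. It assumes the iproute2 `ss` tool is installed and that 
`ss -tanp` prints the state in the first column and the owning process as 
users:(("name",...)). The function names count_close_wait and scan_local 
are made up for this illustration and are not part of Slurm.

```python
import subprocess


def count_close_wait(ss_output: str, process_name: str = "slurmd") -> int:
    """Count sockets in CLOSE-WAIT that belong to process_name.

    Assumes text in the layout printed by `ss -tanp`: the socket state is
    the first column and the owner appears as users:(("name",pid=...,fd=...)).
    """
    needle = f'"{process_name}"'
    count = 0
    for line in ss_output.splitlines():
        fields = line.split()
        # Match on the state column, then on the quoted process name.
        if fields and fields[0] == "CLOSE-WAIT" and needle in line:
            count += 1
    return count


def scan_local(process_name: str = "slurmd") -> int:
    """Run `ss -tanp` on this host and count matching sockets.

    Usually needs root, otherwise `ss` omits processes of other users.
    """
    result = subprocess.run(
        ["ss", "-tanp"], capture_output=True, text=True, check=True
    )
    return count_close_wait(result.stdout, process_name)
```

Running scan_local() as root on a compute node returns the number of such 
sockets. A non-zero count that does not go away would be consistent with 
the report above, but it is not proof of the same bug.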

Bug 18561 was submitted by a user with no support contract, so it's 
unlikely that SchedMD will look into it.

I guess many sites are considering the upgrade to 23.11. If there is an 
issue as reported, a site with a valid support contract needs to open a 
support case. I'm very interested in hearing about any progress with 23.11!

Thanks,
Ole


