After upgrading to version 23.11.3, we started getting slammed with the following log message from slurmctld:
"error: validate_group: Could not find group with gid <id>"
This affects a handful of groups and repeats constantly, drowning out just about everything else. Looking up each of these groups shows that it exists on the scheduler node, and the same goes for all the submission and compute nodes. As far as I can tell, Slurm should be able to locate the groups in question.
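For anyone who wants to repeat that kind of check, here is a minimal sketch of a gid lookup, assuming Python 3 is available on the node. It uses the standard library's `grp.getgrgid`, which queries the system group database; whether that matches the exact lookup path slurmctld takes is an assumption on my part. The helper name `lookup_gids` and the gid passed in at the bottom are placeholders, so substitute the gids from the log lines.

```python
import grp
import os


def lookup_gids(gids):
    """Return {gid: group name}, with None for any gid that cannot be resolved."""
    results = {}
    for gid in gids:
        try:
            results[gid] = grp.getgrgid(gid).gr_name
        except KeyError:
            # grp.getgrgid raises KeyError when no entry exists for the gid
            results[gid] = None
    return results


if __name__ == "__main__":
    # Placeholder input: the current process's gid. Replace with the gids
    # reported in the slurmctld error messages.
    for gid, name in lookup_gids([os.getgid()]).items():
        print(gid, name if name is not None else "NOT FOUND")
```

Running this on the scheduler node, ideally as the same user slurmctld runs as, and then on a submission and a compute node makes it easy to compare results side by side.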
Jobs submitted by users in those groups go through just fine. They get scheduled, run, and clean up with no problem. I'm at a loss as to where to look next.