Dear Slurm community,
We are running Slurm 23.02.7. For a small fraction of jobs, running "scontrol update job=JOBID comment=SOME-COMMENT" on an already running job fails with "Job is no longer pending execution for job <JOBID>".
For the majority of running jobs (roughly 98%, by my estimate), the update works perfectly fine. I also see no reason why updating the comment of a running job should not be allowed.
Has anyone else observed this issue, or is there an explanation for why this behavior might be expected?
(We abuse the comment field to track which GPUs within a node are assigned to a job: we did not find any option within Slurm to get the IDs of the assigned GPUs, and we need that information for our monitoring framework.)
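
For anyone who wants to check whether their own system shows the same behavior, here is a minimal Python sketch of how the update call could be wrapped so that failing jobs are recorded. The helper names (build_update_cmd, classify_result, update_comment) are hypothetical and not part of Slurm. The sketch assumes the error text quoted above appears verbatim in the output of scontrol, which may differ between versions.

```python
import subprocess

# Error text as quoted in this report; matched as a substring (assumption:
# scontrol prints it verbatim on stdout or stderr).
NOT_PENDING_MSG = "Job is no longer pending execution"


def build_update_cmd(job_id, comment):
    """Return the argv for the scontrol call described in this post."""
    return ["scontrol", "update", f"job={job_id}", f"comment={comment}"]


def classify_result(returncode, output):
    """Classify one scontrol result as 'ok', 'not_pending' or 'other_error'."""
    if NOT_PENDING_MSG in output:
        return "not_pending"
    return "ok" if returncode == 0 else "other_error"


def update_comment(job_id, comment):
    """Run scontrol and return (status, combined stdout and stderr)."""
    proc = subprocess.run(
        build_update_cmd(job_id, comment),
        capture_output=True,
        text=True,
    )
    output = proc.stdout + proc.stderr
    return classify_result(proc.returncode, output), output
```

Calling update_comment for each running job and saving the job IDs that come back as 'not_pending' would give a list of affected jobs whose state can then be compared against the unaffected ones.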
Best regards,
Thomas Zeiser, NHR@FAU