[slurm-users] 21.08.6 srun fails with error "Invalid job credential" ; sbatch is fine.
Williams, Jenny Avis
jennyw at email.unc.edu
Fri May 13 21:31:03 UTC 2022
Yesterday I upgraded slurmdbd and slurmctld nodes from RHEL7 / Slurm v. 20.11.8 to RHEL8.5 / Slurm v. 21.08.6 on our production cluster.
I also updated Slurm on the RHEL7 login nodes to 21.08.6.
Sbatch jobs run fine.
Srun, however, fails from the updated login node with "Invalid job credential" errors. Sruns from nodes that have not been updated run fine.
I am hoping this looks familiar to you.
$ srun --slurmd-debug=verbose -n 1 -t 8:00:00 --mem=3g -p interact -w c0801 --pty /bin/bash
srun: job 45281066 queued and waiting for resources
srun: job 45281066 has been allocated resources
srun: error: Task launch for StepId=45281066.0 failed on node c0801: Invalid job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete