You need to find the node on which the job started.
Then look at the slurmd log on that node. It may indicate why the launch failed.
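
In case it helps, here is a rough sketch of how I would look up the node. The helper name batch_host is made up for illustration; it just pulls the BatchHost field out of "scontrol show job" output. I am assuming the job record still lists a BatchHost; after a requeue it may not, in which case sacct (if accounting is enabled) may show the node list from the failed attempt. JOBID is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): read `scontrol show job` output
# on stdin and print the value of the first BatchHost= field, if any.
batch_host() {
    # Put each KEY=VALUE pair on its own line, then keep only BatchHost.
    tr ' ' '\n' | sed -n 's/^BatchHost=//p' | head -n 1
}

# Typical use on a cluster (JOBID is a placeholder, commands left commented):
#   scontrol show job "$JOBID" | batch_host
#   sacct -j "$JOBID" --format=JobID,State,NodeList   # needs accounting enabled
#   scontrol show config | grep -i SlurmdLogFile      # slurmd log location
```

Once you know the node, log in there and read the file that SlurmdLogFile points to, around the time the job was launched.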

On Tue, 7 Jan 2025 at 11:30, sportlecon sportlecon via slurm-users <slurm-users@lists.schedmd.com> wrote:
Slurm 24.11 - squeue displays reason "launch failed requeued held"
