[slurm-users] Re: How do you handle GPU node failures during long jobs?