We have an HPC cluster where the average job length is measured in days, not hours. Users are careful to add checkpoints to their jobs, but even so, preempting a job that is close to its walltime limit (max: 14 days) can be very disruptive. I checked what options preemption offers, but none seem to protect jobs near their finish line. PreemptExemptTime ensures a minimum job runtime, and GraceTime grants a grace period after the job has been selected for preemption. Is there anything I am missing to achieve what I want?
Thank you!
As far as I know, GraceTime is the closest to what you are looking for. It applies whenever a job is selected for preemption, not only near the end of its walltime, but it may be good enough.
The alternatives are to not preempt these long jobs, to checkpoint more often, or to "hack your way out". By the latter I mean that most job settings can be changed by root. I haven't checked whether that holds for preemption settings, but assuming it does, you could set up a daily cron or at job that makes jobs non-preemptable once they are within x hours (or days) of their walltime limit.
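A rough sketch of such a cron script, under several assumptions: preemption on your cluster is QOS-based (PreemptType=preempt/qos), a QOS exists that no other QOS may preempt (called "nopreempt" here, a made-up name), and moving a running job into that QOS really does shield it, which I have not verified. The 24-hour threshold and the function names are also just placeholders. It reads the remaining time from squeue's %L field and changes the QOS with scontrol update.

```python
import subprocess
import sys

THRESHOLD_SECONDS = 24 * 3600  # assumption: protect jobs in their last 24 hours
PROTECTED_QOS = "nopreempt"    # hypothetical QOS that nothing is allowed to preempt


def parse_time_left(text):
    """Convert squeue's [days-]hours:minutes:seconds into seconds.

    Returns None for values that are not a duration (e.g. UNLIMITED).
    """
    days, _, clock = text.strip().rpartition("-")
    parts = clock.split(":")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        return None
    if days and not days.isdigit():
        return None
    # Left-pad so that "MM:SS" and "SS" are read as 0 hours (and 0 minutes).
    hours, minutes, seconds = [0] * (3 - len(parts)) + [int(p) for p in parts]
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def jobs_near_walltime(squeue_output, threshold=THRESHOLD_SECONDS):
    """Return the job IDs whose remaining time is at or below the threshold."""
    selected = []
    for line in squeue_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        job_id, time_left = fields
        remaining = parse_time_left(time_left)
        if remaining is not None and remaining <= threshold:
            selected.append(job_id)
    return selected


def protect_jobs():
    """Query running jobs and move the ones near their limit to PROTECTED_QOS."""
    # -h: no header, -t RUNNING: running jobs only, %i: job ID, %L: time left
    result = subprocess.run(
        ["squeue", "-h", "-t", "RUNNING", "-o", "%i %L"],
        capture_output=True, text=True, check=True,
    )
    for job_id in jobs_near_walltime(result.stdout):
        # Must run as root or as a Slurm administrator.
        subprocess.run(
            ["scontrol", "update", f"JobId={job_id}", f"QOS={PROTECTED_QOS}"],
            check=True,
        )


# Nothing is changed unless the script is started with an explicit --apply.
if __name__ == "__main__" and "--apply" in sys.argv[1:]:
    protect_jobs()
```

Run from root's crontab with --apply, at whatever interval matches the threshold. If your preemption is partition-based rather than QOS-based, the scontrol update line would need to change a different job attribute, so please test it on a throwaway job first.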
On Thu, Feb 5, 2026 at 12:41 AM Irene Azaceta via slurm-users <slurm-users@lists.schedmd.com> wrote:
We have an HPC cluster where the average job length is measured in days, not hours. Users are careful to add checkpoints to their jobs, but even so, preempting a job that is close to its walltime limit (max: 14 days) can be very disruptive. I checked what options preemption offers, but none seem to protect jobs near their finish line. PreemptExemptTime ensures a minimum job runtime, and GraceTime grants a grace period after the job has been selected for preemption. Is there anything I am missing to achieve what I want?
Thank you!