[slurm-users] Verifying preemption WON'T happen
Groner, Rob
rug262 at psu.edu
Fri Sep 29 16:02:02 UTC 2023
On our system, for some partitions, we guarantee that a job can run for at least an hour before being preempted by a higher-priority job. We use the QOS preempt exempt time (PreemptExemptTime) for this, and it appears to be working. But of course, I want to TEST that it works.
So on a test system, I start a lower priority job on a specific node, wait until it starts running, and then I start a higher priority job for the same node. The test should only pass if the higher priority job has an OPPORTUNITY to preempt the lower priority job, and doesn't.
Now, I know I can get the preempt eligible time out of scontrol for the lower priority job and verify that it's set for an hour (I do check that already), but that's not good enough for me. I could obviously let the test run for an hour to verify the lower priority job was never preempted, but that's not really feasible. So instead, I want to verify that the higher priority job has had a chance to preempt the lower priority job and did not.
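For reference, the preempt eligible time check described above could look roughly like the Python sketch below. It assumes the job text comes from `scontrol show job <id>`, that the relevant fields are named StartTime and PreemptEligibleTime, and that the eligible time is the start time plus the exempt time. SAMPLE is a hand-written, trimmed stand-in for illustration, not real output from any cluster, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout for scontrol time fields.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"

def parse_job_fields(scontrol_text):
    """Split whitespace-separated Key=Value tokens into a dict.

    Values that themselves contain spaces are not handled by this sketch.
    """
    return dict(tok.split("=", 1) for tok in scontrol_text.split() if "=" in tok)

def exempt_window(fields):
    """Return PreemptEligibleTime minus StartTime as a timedelta.

    Raises ValueError if either field is not a timestamp (for example N/A).
    """
    start = datetime.strptime(fields["StartTime"], TIME_FORMAT)
    eligible = datetime.strptime(fields["PreemptEligibleTime"], TIME_FORMAT)
    return eligible - start

# Hand-written sample for illustration only (hypothetical job and times).
SAMPLE = """
   JobId=101 JobName=low_prio
   JobState=RUNNING Reason=None
   StartTime=2023-09-29T12:00:00 EndTime=2023-09-29T16:00:00
   PreemptEligibleTime=2023-09-29T13:00:00 PreemptTime=None
"""

fields = parse_job_fields(SAMPLE)
window = exempt_window(fields)
# The test expectation: at least one hour of guaranteed run time.
assert window >= timedelta(hours=1)
```

This only confirms what the scheduler has recorded for the job; it says nothing about whether a preemption attempt was actually considered and declined.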
So far, the way I've been doing that is to check the reported Scheduler in the scontrol job output for the higher priority job. I figure that when the scheduler changes to Backfill instead of Main, then the higher priority job has been seen by the main scheduler and it passed on the chance to preempt the lower priority job.
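A minimal Python sketch of that check follows. It only encodes the assumption stated above (Scheduler moving from Main to Backfill means the main scheduler has already passed over the job); it does not confirm that the assumption is correct. The function names, the timeout and interval defaults, and the choice to treat any state other than RUNNING as a failure are all illustrative assumptions.

```python
import subprocess
import time

def scontrol_show_job(job_id):
    """Return the raw text of `scontrol show job <id>` (needs a Slurm client)."""
    return subprocess.run(
        ["scontrol", "show", "job", str(job_id)],
        check=True, capture_output=True, text=True,
    ).stdout

def job_fields(text):
    """Split whitespace-separated Key=Value tokens into a dict."""
    return dict(tok.split("=", 1) for tok in text.split() if "=" in tok)

def survived_backfill_pass(show_job, low_id, high_id,
                           timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll both jobs until the high-priority one reports Scheduler=Backfill.

    Returns True if the low-priority job is still RUNNING at that point,
    False as soon as it is seen in any other state, and raises TimeoutError
    if the Scheduler field never changes within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Read the high-priority job first so the low-priority job's state
        # is observed after any change in the Scheduler field.
        high = job_fields(show_job(high_id))
        low = job_fields(show_job(low_id))
        if low.get("JobState") != "RUNNING":
            return False
        if high.get("Scheduler") == "Backfill":
            return True
        sleep(interval)
    raise TimeoutError("high-priority job never reported Scheduler=Backfill")
```

On a test system this would be called as `survived_backfill_pass(scontrol_show_job, low_id, high_id)`. Passing the lookup function in as a parameter keeps the polling logic testable without a running cluster.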
Is that a good assumption? Is there any other, or potentially quicker, way to verify that the higher priority job will NOT preempt the lower priority job?
Rob