Hello.
I'm going to use Slurm's cloud scheduling feature in a private cloud.
The problem is that my cloud cannot scale instances out/in concurrently.
That is, once a scale out/in operation is triggered, no other operation is processed until it completes.
So any later Suspend/Resume request must start only after the previous operation has finished, but how long that takes is not known accurately, so I can't pick a reliable timeout.
Is there any way to limit concurrent Suspend/Resume requests in Slurm?
As far as I know, there are SuspendRate and ResumeRate, but these only limit the number of nodes suspended or resumed per minute; they do not limit concurrency.
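To make the kind of serialization I'm after concrete, here is a minimal sketch of a wrapper that could be set as both ResumeProgram and SuspendProgram, so that only one scale operation runs at a time. It is only an assumption about how this might be worked around outside Slurm, not a built-in Slurm feature. The lock path, the `REAL_SCRIPT` path, and the `run_serialized` helper are hypothetical names chosen for illustration.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical paths: adjust to the site. Every wrapper copy must share LOCK_PATH.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "slurm_cloud_scale.lock")
REAL_SCRIPT = "/usr/local/sbin/cloud_scale.sh"  # hypothetical real scale script


def run_serialized(command, lock_path=LOCK_PATH):
    """Run `command` while holding an exclusive file lock; return its exit code."""
    with open(lock_path, "w") as lock_file:
        # Blocks here until any earlier scale out/in call has released the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            return subprocess.call(command)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Slurm invokes the program with the node hostlist as its argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_serialized([REAL_SCRIPT] + sys.argv[1:]))
```

With something like this, the time a request spends waiting for the lock would presumably still count against ResumeTimeout and SuspendTimeout, so those would have to be set generously, which brings back the problem that the duration is not known in advance.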