After using just Fairshare for over a year on our GPU cluster, we have decided it does not achieve what we really want among our groups. We have decided to look at preemption.
What we want is for users to NOT have a maximum number of jobs or GPUs (if they are the only person on the cluster they should be able to use it all), but if another user comes to the "full" cluster they should immediately be able to run some jobs. Thus preemption is needed.
In our scheme we want
* users to have N protected GPU jobs that cannot be preempted where N is the number of GPUs allocated.
* N may not be the same for all users. Some privileged users get more.
* a user's pending jobs will have lower priority the more GPUs are allocated to that user's running jobs. Maybe doable somehow with PriorityWeightJobSize, though I am not sure how.
* Jobs over N are subject to preemption (and requeued if --requeue is given) with shortest running jobs of the user with most unprotected GPUs preempted first.
* another complication is we have a variety of different GPUs and users may ask for specific ones which can limit what unprotected GPU jobs are available for preemption
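
To make the selection rule in the list above concrete, here is a minimal Python sketch of the logic only. It is not Slurm code and does not use any Slurm API; the `Job` fields, the function names, and the rule that a multi-GPU job is protected only if it fits entirely within the remaining quota are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Job:
    # Hypothetical record; field names are illustrative, not Slurm's.
    job_id: int
    user: str
    gpus: int          # GPUs allocated to this running job
    runtime_s: int     # seconds the job has been running
    gpu_type: str = "any"

def unprotected_jobs(jobs: List[Job], quotas: Dict[str, int],
                     default_n: int = 0) -> List[Job]:
    """Return jobs beyond each user's N protected GPUs.

    Assumption: a user's longest-running jobs are protected first, and a
    job is protected only if it fits entirely within the remaining quota.
    """
    used: Dict[str, int] = {}
    result: List[Job] = []
    for job in sorted(jobs, key=lambda j: j.runtime_s, reverse=True):
        n = quotas.get(job.user, default_n)
        if used.get(job.user, 0) + job.gpus <= n:
            used[job.user] = used.get(job.user, 0) + job.gpus
        else:
            result.append(job)
    return result

def pick_victim(jobs: List[Job], quotas: Dict[str, int],
                gpu_type: Optional[str] = None) -> Optional[Job]:
    """Pick the next job to preempt, or None if nothing is eligible.

    Prefers the user holding the most unprotected GPUs, then that user's
    shortest-running unprotected job. If gpu_type is given, only jobs on
    that GPU type are eligible.
    """
    unprotected = unprotected_jobs(jobs, quotas)
    totals: Dict[str, int] = {}
    for job in unprotected:
        totals[job.user] = totals.get(job.user, 0) + job.gpus
    eligible = [j for j in unprotected
                if gpu_type is None or j.gpu_type == gpu_type]
    if not eligible:
        return None
    return min(eligible, key=lambda j: (-totals[j.user], j.runtime_s))
```

Because protection is recomputed from running time on every call, "promotion" falls out for free in this model: when a protected job finishes, the longest-running unprotected job of that user becomes protected on the next pass. The GPU-type filter shows how a request for a specific GPU can shrink the set of preemptable jobs.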
My first attempt to do this in SLURM was to just create two partitions, GPU and GPU-req, with different PriorityTier values and the latter partition having PreemptMode=REQUEUE. But N would be set by a MaxTRES on the first partition and be the same for everyone, and we need it to be INDEPENDENT for each user.
Also, users would have to "think" about which partition to submit jobs to. And users want their longest running "unprotected" job to be PROMOTED automatically to a "protected" job when a "protected" job finishes. However, SLURM does not allow running jobs to move between partitions.
I am trying to figure out QOS preemption, which might solve the independent N per user issue, but I don't think it will solve the promotion issue.
Any ideas how this scheme might be possible in SLURM?
Otherwise I might have to write a complicated cron job that tries to do it all "outside" of SLURM issuing scontrol commands.
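
As a rough idea of what the plumbing for such a cron job could look like, here is a small Python sketch of two helpers. It assumes running times arrive in the `[days-][hours:]minutes:seconds` form that `squeue` prints for time used, and it only builds `scontrol requeue` argument lists without running anything. The function names are hypothetical, and deciding which job IDs to requeue is left out entirely.

```python
from typing import Iterable, List

def parse_elapsed(text: str) -> int:
    """Convert an elapsed-time string ([D-][HH:]MM:SS) to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Pad missing leading fields (hours, or hours and minutes) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def requeue_commands(job_ids: Iterable[int]) -> List[List[str]]:
    """Build one 'scontrol requeue <id>' argv list per job (dry run only)."""
    return [["scontrol", "requeue", str(job_id)] for job_id in job_ids]
```

Keeping the command construction separate from execution makes it possible to log or review what the script would do before letting it touch running jobs.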
--------------------------------------------------------------- Paul Raines http://help.nmr.mgh.harvard.edu MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging 149 (2301) 13th Street Charlestown, MA 02129 USA