thomas.hartmann--- via slurm-users wrote:
My idea was to basically have three partitions:
- PartitionName=short MaxTime=04:00:00 State=UP Nodes=node[01-99] PriorityTier=100
- PartitionName=long_safe MaxTime=14-00:00:00 State=UP Nodes=node[01-50] PriorityTier=100
- PartitionName=long_preempt MaxTime=14-00:00:00 State=UP Nodes=node[01-99] PriorityTier=40 PreemptMode=requeue
I don't know why you would consider preemption if your short jobs run for at most four hours; you can simply wait for them to finish.
My first approach would be to have two partitions, both containing all nodes, but with different QoSes assigned to them. That way you can cap the short jobs at a certain number of CPUs and also cap the long jobs at a certain number of CPUs, maybe 80% of the cluster for each.
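As a rough sketch of what I mean (not a tested configuration): the QoS names part_short and part_long are made up, and the CPU limit assumes 32 cores per node, i.e. 99 x 32 = 3168 CPUs in total, of which 80% is about 2534. Adjust the numbers to your real hardware. It also assumes accounting through slurmdbd is in place, since QoS limits are only enforced with it.

```
# slurm.conf (sketch; QoS names and CPU counts are assumptions)
AccountingStorageEnforce=limits,qos

# Both partitions span all nodes; each one gets its own partition QoS.
PartitionName=short MaxTime=04:00:00    State=UP Nodes=node[01-99] QOS=part_short
PartitionName=long  MaxTime=14-00:00:00 State=UP Nodes=node[01-99] QOS=part_long

# The QoSes themselves are created once with sacctmgr, for example:
#   sacctmgr add qos part_short
#   sacctmgr modify qos part_short set GrpTRES=cpu=2534
#   sacctmgr add qos part_long
#   sacctmgr modify qos part_long set GrpTRES=cpu=2534
```

With GrpTRES on the partition QoS, all jobs running in that partition together cannot use more than the given number of CPUs, so neither short nor long jobs can fill the whole cluster and no preemption is needed.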
Gerhard