[slurm-users] Partition question

Ransom, Geoffrey M. Geoffrey.Ransom at jhuapl.edu
Thu Dec 19 15:44:48 UTC 2019



         The simplest is probably to just have a separate partition that will only allow job times of 1 hour or less.
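
If I understand correctly, that means two partitions overlapping the same nodes, something along the lines of the sketch below (node names and limits are placeholders, not our real layout, and using our 4-hour cutoff):

    # slurm.conf (hypothetical): both partitions share the same hardware
    PartitionName=short Nodes=node[001-100] MaxTime=04:00:00 Default=YES
    PartitionName=long  Nodes=node[001-050] MaxTime=INFINITE

so long jobs would only ever be eligible for half of the nodes, while short jobs could land anywhere.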

This is how our Univa queues used to work, by overlapping the same hardware. Univa shows available "slots" to users, and we had a lot of confused users complaining about all of those free slots (which were actually busy slots in the other queue) while their jobs sat in the queue, plus new users confused as to why their jobs were being killed after 4 hours. I was able to move the short/long behavior into job classes, use RQSes, and get back to a single queue.

While Slurm doesn't show users unused resources the way Univa does, I am concerned that going back to two queues (partitions) will cause user interaction and adoption problems.
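
To make the concern concrete: with a single partition users would only have to state a walltime, while with two partitions they would also have to pick the right one (script names below are made up):

    # single partition: users just give a time limit
    sbatch --time=02:00:00 quick_job.sh

    # two partitions: users must also target the right partition
    sbatch -p short --time=02:00:00 quick_job.sh
    sbatch -p long  --time=7-00:00:00 big_job.sh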

         It all depends on what best suits the specific needs.

Is there a way to have one partition that holds aside a small percentage of resources for jobs with a runtime under 4 hours, i.e. so that jobs with long runtimes cannot tie up 100% of the resources at one time? Some kind of virtual partition that feeds into two other partitions based on runtime would also work. The goal is for users to keep submitting jobs to a single partition while the scheduler prevents multi-week jobs from tying up 100% of the compute resources.
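
One idea I have been toying with, though I am not sure it is the intended mechanism, is a single partition plus QOS limits, where a "long" QOS carries a group TRES cap so that long jobs as a group can never hold more than, say, half the cores. Roughly, with made-up node names and numbers:

    # slurm.conf (hypothetical 100-node / 3200-core cluster)
    AccountingStorageEnforce=limits,qos
    PartitionName=batch Nodes=node[001-100] Default=YES MaxTime=INFINITE

    # QOS setup: short jobs capped at 4 hours, long jobs as a group capped at half the cores
    sacctmgr add qos short
    sacctmgr modify qos short set MaxWall=04:00:00
    sacctmgr add qos long
    sacctmgr modify qos long set GrpTRES=cpu=1600

Then a job submitted with --qos=long would pend once long jobs collectively hold 1600 cores, while --qos=short jobs could still start on whatever is left. The piece I have not worked out is making the QOS selection automatic based on the requested time limit (a job_submit.lua plugin, maybe?) rather than trusting users to pick the right one.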

Thanks.
On 12/16/2019 2:29 PM, Ransom, Geoffrey M. wrote:

Hello
   I am looking into switching from Univa (SGE) to Slurm and am figuring out how to implement some of our usage policies in Slurm.

We have a Univa queue which uses job classes and RQSes to limit jobs with a run time over 4 hours to only half the available slots (CPU cores), so some slots are always free for quick jobs. We don't want all of our resources tied up with multi-week jobs when someone has a batch of 1-hour jobs to run.

Is there a way to implement this in Slurm, i.e. a partition which holds some CPU/GPU resources aside for jobs with a short runtime?

What would be the preferred solution for this issue in a Slurm world?