[slurm-users] Allow certain users to run over partition limit

Sebastian T Smith stsmith at unr.edu
Wed Jul 8 19:27:52 UTC 2020

Re-reading your question, I don't think reservations are applicable to your problem.

Reservations isolate resources for a period of time for a group of users; they do not change the job scheduling policies tied to those resources.  You can allocate a number of dedicated resources for a long duration, say 10 nodes for 60 days, but the partition/assoc/QOS/etc. limits will still apply to those resources, so your 24hr runtime limit would still apply.
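
To make the "10 nodes for 60 days" case concrete, a reservation along those lines could be created roughly as below.  This is only a sketch, not our actual configuration: the reservation name longrun, the users alice and bob, and the node range node[01-10] are made up for illustration.  The small run helper is also my own addition; it just prints the command unless DRY_RUN=0 is set, so nothing is changed by accident.

```shell
#!/bin/sh
# Sketch only: the reservation name, users, and node list are hypothetical.
# Prints the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reserve 10 nodes for 60 days for two users.  Partition/association/QOS
# limits (such as a 24hr runtime limit) still apply to jobs that run
# inside the reservation.
run scontrol create reservation ReservationName=longrun \
    Users=alice,bob Nodes='node[01-10]' \
    StartTime=now Duration=60-00:00:00
```

Jobs would then be submitted with --reservation=longrun, and each job would still be subject to the usual time limits.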

I think of reservations as a lightweight, virtual partition.

Our max runtime is 14 days (via QOS).  This is a major difference between our systems.  We use reservations to allocate resources from our shared pool for very long durations, but we haven't needed to adjust the max job runtime because it's been sufficient.  This confused my answer.  Sorry!
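
For the original question, a QOS is probably the closer fit.  A minimal sketch of what that might look like follows; the QOS name long, the user alice, and the script job.sh are hypothetical, and I'm assuming AccountingStorageEnforce in slurm.conf already includes limits and qos so that QOS limits are enforced at all.  As far as I know, the PartitionTimeLimit QOS flag is what allows a job in that QOS to exceed the partition's own time limit, but please check it against the sacctmgr man page for your version.  The run helper is my own addition and only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the QOS name "long", user "alice", and job.sh are hypothetical.
# Prints each command by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a QOS, then give it a 14-day wall-clock limit.  The
# PartitionTimeLimit flag lets jobs in this QOS override the
# partition's TimeLimit.
run sacctmgr -i add qos long
run sacctmgr -i modify qos long set MaxWall=14-00:00:00 Flags=PartitionTimeLimit

# Grant the QOS to one user only; everyone else keeps the partition limit.
run sacctmgr -i modify user where name=alice set qos+=long

# The user then requests the QOS explicitly at submit time.
run sbatch --qos=long --time=3-00:00:00 job.sh
```

Nothing in slurm.conf needs to change per user with this approach; access is granted or revoked with sacctmgr alone.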




University of Nevada, Reno <http://www.unr.edu/>
Sebastian Smith
High-Performance Computing Engineer
Office of Information Technology
1664 North Virginia Street
MS 0291

work-phone: 775-682-5050
email: stsmith at unr.edu
website: http://rc.unr.edu

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Matthew BETTINGER <matthew.bettinger at external.total.com>
Sent: Wednesday, July 8, 2020 10:53 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Allow certain users to run over partition limit

OK, I see the resource limit hierarchy:

Partition QOS limit
Job QOS limit
User association
Account association(s), ascending the hierarchy
Root/Cluster association
Partition limit

Where in this list do reservations fall?  Do reservations override all of these if they are set to exceed the limits imposed by the partition configuration?  Thanks!

On 7/7/20, 6:02 PM, "slurm-users on behalf of Sebastian T Smith" <slurm-users-bounces at lists.schedmd.com on behalf of stsmith at unr.edu> wrote:


    We use Job QOS and Resource Reservations for this purpose.  QOS is a good option for a "permanent" change to a user's resource limits.  We use reservations similar to how you're currently using partitions to "temporarily" provide a resource boost without the complexities of re-partitioning or mucking with associations.

    Precedence in resource limits: https://slurm.schedmd.com/resource_limits

    From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Matthew BETTINGER <matthew.bettinger at external.total.com>
    Sent: Tuesday, July 7, 2020 9:40 AM
    To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
    Subject: [slurm-users] Allow certain users to run over partition limit


    We have a Slurm system with partitions set to a max runtime of 24 hours.  What would be the proper way to allow a certain set of users to run jobs on the current partitions beyond the partition limits?  In the past we would isolate some nodes based on their job requirements, make a new partition and a reservation with the users, and then push out the new configuration.  This works but is pretty unwieldy, and the nodes are basically wasted whenever these special users are not using them, since they are unavailable to everyone else.

    Is there some way to let some users occasionally run over the partition run time that is easier than manually modifying slurm.conf?  Possibly with QOS?

