[slurm-users] priority access and QoS

Jason Simms jsimms1 at swarthmore.edu
Mon Feb 27 19:28:25 UTC 2023


Hello all,

I haven't found any guidance that seems to be the current "better
practice," but this does seem to be a common use case, and I imagine there
are multiple ways to accomplish the goal. For example, you could certainly
do it with QoS, but you can likely also accomplish it with some other
weighting scheme based on, e.g., account. At my last position, I handled
this by creating a partition containing the purchased nodes that permitted
a specific account only and carried a higher PriorityTier, and by ensuring
the cluster was configured to preempt based on partition priority. That
way, even if the same nodes were also in a different partition, a job from
a user in that account would preempt (if needed) jobs from users not in
the account. These are sample configuration lines to illustrate (obviously
simplified):

# Preempt based on partition priority; preempted jobs are requeued
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

# General-access partition spanning all nodes
PartitionName=node PriorityTier=50 Nodes=node[01-06]
# Higher-priority partition restricted to the purchasing group's account
PartitionName=smithlab AllowAccounts=smithlab PriorityTier=100 Nodes=node06
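
Once that's in place, a quick sanity check (standard scontrol invocations;
swap in your own partition name) is something like:

scontrol show partition smithlab
scontrol show config | grep -i preempt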

I never heard from a user that this failed to preempt when necessary, so I
presume it works as advertised (in this case, if a user from smithlab ran a
job on node06, it would preempt non-smithlab jobs if the requested
resources were otherwise unavailable). Note that the user needs to specify
the smithlab account in the batch submission file or on the command line,
especially if smithlab is not their default account.
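
For example, in a batch script (the names here match the illustration
above, so adjust to your site):

#SBATCH --partition=smithlab
#SBATCH --account=smithlab

or equivalently on the command line: sbatch --partition=smithlab
--account=smithlab job.sh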

If someone can explain why this approach isn't advisable, or if there is
a preferred approach, I would welcome the feedback.
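
For what it's worth, if you go the QoS route Marko asked about instead, I'd
expect it to look roughly like the following (an untested sketch with
placeholder names, not something I run in production): set
PreemptType=preempt/qos in slurm.conf, then

sacctmgr add qos smithlab-prio
sacctmgr modify qos smithlab-prio set Priority=100 Preempt=normal PreemptMode=requeue
sacctmgr modify account smithlab set qos+=smithlab-prio

and have the smithlab users submit with --qos=smithlab-prio. For tracking
purchased service units, GrpTRESMins on the QoS (e.g. GrpTRESMins=cpu=100000)
can cap usage, though as I understand it the recorded usage decays according
to PriorityDecayHalfLife unless the QoS has the NoDecay flag set.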

Warmest regards,
Jason


On Mon, Feb 27, 2023 at 2:09 PM Styrk, Daryl <Daryl.Styrk at ucsf.edu> wrote:

> Marko,
>
>
>
> I’m in a similar situation. We have many accounts with dedicated hardware,
> and we recently ran into a situation where a user with dedicated hardware
> submitted hundreds of jobs that overflowed onto the community hardware and
> caused an unexpected backlog. I believe QoS will help us with that as well.
> I’ve been researching and reading about best practices.
>
>
>
> Regards,
>
> Daryl
>
>
>
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Marko Markoc <mmarkoc at pdx.edu>
> Date: Wednesday, February 22, 2023 at 1:56 PM
> To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
> Subject: [slurm-users] priority access and QoS
>
> Hi All,
>
>
>
> Currently in our environment we only have one default "free" tier of
> access to our resources, and we are looking to add an additional,
> higher-priority tier. That means that jobs from users who "purchased" a
> certain amount of service units would preempt jobs of users in the free
> tier. I was thinking of using Slurm QoS to achieve this, adding
> users/groups to the newly created QoS via sacctmgr, but I wanted to check
> with all of you whether there is a better way to accomplish this through
> Slurm. Also, could GrpTRESMins be used to automatically keep track of SU
> usage by a certain user or group, or is there some better usage-tracking
> mechanism?
>
>
>
> Thank You all,
>
> Marko
>


-- 
*Jason L. Simms, Ph.D., M.P.H.*
Manager of Research Computing
Swarthmore College
Information Technology Services
(610) 328-8102
Schedule a meeting: https://calendly.com/jlsimms

