[slurm-users] On the ability of coordinators

Renfro, Michael Renfro at tntech.edu
Wed May 17 18:11:49 UTC 2023

If there’s a fairshare component to job priorities, and there’s a share assigned to each user under the account, wouldn’t the light user’s jobs move ahead of any of the heavy user’s pending jobs automatically?
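As a rough illustration of why that would happen: with the multifactor priority plugin, a job's priority is a weighted sum of factors that each range from 0.0 to 1.0. The sketch below keeps only the age and fairshare terms and ignores the others (job size, partition, QOS and so on). The weights and factor values are invented for the example, not taken from any real cluster; on a live system Slurm computes the factors itself.

```python
# Hypothetical weights, standing in for PriorityWeightAge and
# PriorityWeightFairshare in slurm.conf. Not taken from any real site.
WEIGHTS = {"age": 1000, "fairshare": 10000}


def job_priority(factors, weights=WEIGHTS):
    """Weighted sum of per-job factors, each expected in the range 0.0 to 1.0."""
    return int(sum(weights[name] * factors.get(name, 0.0) for name in weights))


# Made-up factors: the heavy user's jobs have waited longer (higher age
# factor), but heavy recent usage has pushed the fairshare factor down.
heavy_user_job = {"age": 0.5, "fairshare": 0.125}
light_user_job = {"age": 0.0, "fairshare": 0.875}

print("heavy:", job_priority(heavy_user_job))
print("light:", job_priority(light_user_job))
```

With these numbers the light user's freshly submitted job outranks the heavy user's older one, because the fairshare weight dominates the age weight. If a site weights age more heavily than fairshare, the ordering could come out the other way.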

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of "Groner, Rob" <rug262 at psu.edu>
Reply-To: Slurm User Community List <slurm-users at lists.schedmd.com>
Date: Wednesday, May 17, 2023 at 1:09 PM
To: "slurm-users at lists.schedmd.com" <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] On the ability of coordinators

Ya, I found they had the power to hold jobs just by experimentation.  Maybe it will turn out I had something misconfigured and coordinators don't have that ability either.  I hope that's not the case, since being able to hold jobs in their account gives them some usefulness.

My interest in this was solely focused on what coordinators could do to jobs within their account.  So I accepted that a coordinator couldn't move jobs in their account to a higher priority than jobs in other accounts.  I just wanted the coordinator to be able to move jobs in their account ahead of other jobs within the same account.  Being able to use hold/release seems like what we're looking for.  I just wonder why coordinators can't use "top" as well, for jobs within their coordinated account.  I guess "top" is meant to move them to the top of the entire pending queue, and in my case, I was only interested in the coordinator moving certain jobs in their account to the top of the account-related queue.  But of course, there ISN'T an account-related queue, so maybe that's why top doesn't work for a coordinator.  I think I just answered my own question.

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Brian Andrus <toomuchit at gmail.com>
Sent: Wednesday, May 17, 2023 2:00 PM
To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] On the ability of coordinators

Coordinator permissions from the man pages:

A special privileged user, usually an account manager, that can add users or sub-accounts to the account they are coordinator over. This should be a trusted person since they can change limits on account and user associations, as well as cancel, requeue or reassign accounts of jobs inside their realm.

So, I read that as it manages accounts in slurmdb with minimal access to the jobs themselves. So you would be stuck with cancel/requeue. I see no mention of hold, but if that is one of the permissions, I would say, yes, your approach does what you want within the limits of what the default permissions of a coordinator can do.

Of course, that still may not work if there are other accounts/partitions/users with higher priority jobs than User B. Specifically if those jobs can use the same resources A's jobs are running on.

Brian Andrus

On 5/17/2023 10:49 AM, Groner, Rob wrote:
I'm not sure what you mean by "if they have the permissions".  I'm talking about someone who is specifically designated as "coordinator" of an account in slurm.  With that designation, and no other admin level changes, I'm not aware that they can directly change the priority of jobs associated with the account.

If you're talking about additional permissions or admin levels...we're not looking into that as an option.  We want to purely use the coordinator role to have them manipulate stuff.

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Brian Andrus <toomuchit at gmail.com>
Sent: Wednesday, May 17, 2023 12:58 PM
To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] On the ability of coordinators

If they have the permissions, you can just raise the priority of user B's jobs to be higher than whatever A's currently are. Then they will run next.

That will work if you are able to wait for some jobs to finish and you can 'skip the line' for the priority jobs.
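For reference, raising a job's priority is normally done with "scontrol update JobId=<job_list> Priority=<number>", and Slurm only lets an operator or administrator increase a priority that way. Below is a minimal sketch that just builds that command line. The job IDs and the priority value are placeholders, and the helper name raise_priority_cmd is made up for this example.

```python
def raise_priority_cmd(job_ids, priority):
    """Build an 'scontrol update' command that sets an explicit job priority.

    Only the command line is constructed here; actually running it needs
    operator or administrator rights on the cluster.
    """
    if priority < 1:
        # A priority of 0 would hold the job instead of promoting it.
        raise ValueError("priority must be a positive integer")
    job_list = ",".join(str(job_id) for job_id in job_ids)
    return ["scontrol", "update", f"JobId={job_list}", f"Priority={priority}"]


# Placeholder job IDs for user B's jobs and an arbitrary target priority.
cmd = raise_priority_cmd([2001, 2002], 50000)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run() as is. The target value would need to be chosen by looking at the current priorities of user A's pending jobs, for example with sprio.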

If you need to preempt running jobs, that would take a bit more effort to set up, but is an alternative.

Brian Andrus

On 5/17/2023 6:40 AM, Groner, Rob wrote:
I was asked to see if coordinators could do anything in this scenario:

  *   Within the account that they coordinated, User A submitted 1000s of jobs and left for the day.
  *   Within the same account, User B wanted to run a few jobs really quickly.  Once submitted, his jobs were of course behind User A's jobs.
  *   The coordinator wanted to see the results of User B's runs.

Reading the docs and doing some experiments, here is what I determined:

  *   The coordinator could put a hold on all of User A's jobs in the pending queue.  This won't affect any jobs User A has that aren't tied to the coordinated account.
  *   With User A's jobs held, then User B's jobs would be next to run.
  *   If the coordinator was particularly impatient, he could scancel User A's currently running jobs so that User B's jobs immediately started.
  *   The coordinator would need to remember to release the held jobs, or put them in a uhold so that User A could release them eventually.

It seems like the easiest way for the coordinator to elevate User B's jobs to the top of the queue would be if he could "scontrol top" those jobs.  But my testing indicates that the coordinator doesn't have that permission.  Is there some reason that a coordinator can't use "scontrol top" to change the priority of jobs within the account that he coordinates?
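The hold/release workaround described in the list above could be scripted roughly as follows. This is only a sketch: the user name "usera", the account name "lab" and the job IDs are invented, the helper names are made up for the example, and whether a coordinator is actually allowed to run each command depends on the site's configuration. It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess


def pending_jobs_cmd(user, account):
    """squeue command listing the IDs of a user's pending jobs in one account."""
    return ["squeue", "--noheader", "--format=%i",
            f"--user={user}", f"--account={account}", "--states=PENDING"]


def hold_cmd(job_ids, action="hold"):
    """scontrol hold, uhold or release for a comma-separated job list."""
    if action not in ("hold", "uhold", "release"):
        raise ValueError(f"unsupported action: {action}")
    return ["scontrol", action, ",".join(str(job_id) for job_id in job_ids)]


def cancel_running_cmd(user, account):
    """scancel command for the impatient case: kill the user's running jobs."""
    return ["scancel", f"--user={user}", f"--account={account}", "--state=RUNNING"]


def run(cmd, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it and return stdout."""
    if dry_run:
        print(" ".join(cmd))
        return ""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Dry run with placeholder job IDs; on a real cluster the IDs would come
    # from run(pending_jobs_cmd("usera", "lab"), dry_run=False).split().
    run(pending_jobs_cmd("usera", "lab"))
    held = ["1001", "1002", "1003"]
    run(hold_cmd(held))             # user B's jobs are now next in line
    run(hold_cmd(held, "release"))  # later: let user A's jobs run again
```

Replacing the final "release" with "uhold" would leave the jobs in a state that User A can release without the coordinator's help.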


