[slurm-users] Drain a single user's jobs

Mark Dixon mark.c.dixon at durham.ac.uk
Wed Apr 1 13:22:08 UTC 2020


Hi David,

Thanks for this, it sounds like I've not been trying crazy methods - but 
they don't work for me:

- "sacctmgr modify user foo set qos=drain" did update the association
   ("sacctmgr show associations" showed the QoS change from "normal" to
   "drain"), but from that point on foo's pending jobs refused to start,
   with reason "InvalidQOS".

- "sacctmgr update user foo set maxsubmitjobs=0" had no effect, because the
   qos on our partitions already set MaxSubmitJobs and override it.

But... good news!

We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify user 
foo set GrpSubmitJobs=0" isn't overridden anywhere, and the effect is 
exactly what I wanted - thanks!
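
For reference, a minimal sketch of the approach above, wrapped so it can be
previewed before touching the accounting database. The user name "foo" is
the placeholder from this thread; the helper names (run, drain_user,
undrain_user) and the DRY_RUN switch are made up for illustration. It
assumes, as here, that no qos in use sets GrpSubmitJobs, and that -1 clears
the limit again, as the sacctmgr man page describes for clearing values.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command instead of
# running it; set DRY_RUN=0 to actually call sacctmgr.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

drain_user() {
    # Stop new submissions for the user. In this thread, jobs that were
    # already submitted kept running with this setting.
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=0
}

undrain_user() {
    # Clear the limit again (-1 resets a previously set value).
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=-1
}
```

Calling "drain_user foo" with the default DRY_RUN=1 just prints the
sacctmgr line, which makes it easy to check before running it for real.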

But if anyone knows why my attempt at using a "drain" qos stopped foo's 
previously submitted jobs from running, I'd be very interested to hear 
about it.
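
One thing that might be worth comparing when chasing this is the qos each
pending job carries against the qos list on foo's association. That a
mismatch between the two is what produces "InvalidQOS" is an assumption
here, not something confirmed in this thread. The helper names (run,
compare_qos) and the DRY_RUN switch below are hypothetical; the commands
themselves only read state.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command instead of
# running it; set DRY_RUN=0 to query the live cluster.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

compare_qos() {
    # QoS list currently attached to the user's association(s).
    run sacctmgr show associations where user="$1" format=User,Account,Partition,QOS
    # For each pending job: job ID, the QoS the job carries, pending reason.
    run squeue -u "$1" -t PD -o "%i %q %r"
}
```

Running "compare_qos foo" with DRY_RUN=0 shows both views side by side.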

Thanks again,

Mark

On Wed, 1 Apr 2020, David Rhey wrote:

> Hi Mark,
>
> I *think* you might need to update the user account to have access to that
> QoS (as part of their association). Using sacctmgr modify user <foo> + some
> additional args (they escape me at the moment).
>
> Also, you *might* have been able to set the MaxSubmitJobs at their account
> level to 0 and have them run without having to do the QoS approach - but
> that's just a guess on my end based on how we've done some things here. We
> had a "free period" for our clusters and once it was over we set the
> GrpSubmitJobs on an account to 0, which allowed in-flight jobs to continue
> but no new work to be submitted.
>
> HTH,
>
> David
>
> On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.dixon at durham.ac.uk> wrote:
>
>> Hi all,
>>
>> I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
>>
>> I'd like to stop user foo from submitting new jobs but allow their
>> existing jobs to run.
>>
>> We have several partitions, each with its own qos and MaxSubmitJobs
>> typically set to some value. These qos are stopping a "sacctmgr update user
>> foo set maxsubmitjobs=0" from doing anything useful, as per the
>> documentation.
>>
>> I've tried setting up a competing qos:
>>
>>    sacctmgr add qos drain
>>    sacctmgr modify qos drain set MaxSubmitJobs=0
>>    sacctmgr modify qos drain set flags=OverPartQOS
>>    sacctmgr modify user foo set qos=drain
>>
>> This has successfully prevented the user from submitting new jobs, but
>> their existing jobs aren't running. I'm seeing the reason code
>> "InvalidQOS".
>>
>> Any ideas what I should be looking at, please?
>>
>> Thanks,
>>
>> Mark
>>
>>
>
> -- 
> David Rhey
> ---------------
> Advanced Research Computing - Technology Services
> University of Michigan
>
