[slurm-users] Drain a single user's jobs
Mark Dixon
mark.c.dixon at durham.ac.uk
Wed Apr 1 13:22:08 UTC 2020
Hi David,
Thanks for this, it sounds like I've not been trying crazy methods - but
they don't work for me:
- "sacctmgr modify user foo set qos=drain" did set up the association
("sacctmgr show associations" showed that the QoS changed from "normal" to
"drain"), but that is exactly when foo's pending jobs refused to start, with
reason "InvalidQOS".
- "sacctmgr update user foo set maxsubmitjobs=0" was ignored because the qos
already set on the partitions carry their own MaxSubmitJobs.
But... good news!
We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify user
foo set GrpSubmitJobs=0" isn't overridden anywhere, and the effect is
exactly what I wanted - thanks!
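For the archive, here is a minimal sketch of the approach that worked. It only prints the sacctmgr commands instead of running them, so it can be read and checked without touching the accounting database. The function names are made up for illustration, "foo" is the placeholder user from this thread, and the "-1 clears a limit" convention and the GrpSubmit format field are taken from the sacctmgr man page as I understand it; please check them against your own Slurm version before relying on them.

```shell
#!/bin/sh
# Sketch only: each function prints a sacctmgr command rather than running it.
# Function names are hypothetical; "foo" is the placeholder user from the thread.

block_submit_cmd() {
    # GrpSubmitJobs=0 on the user's association: new submissions are refused,
    # jobs already in the queue are left alone (the behaviour reported above).
    printf 'sacctmgr modify user %s set GrpSubmitJobs=0\n' "$1"
}

check_cmd() {
    # Show the limit on the user's association(s).
    # Assumption: GrpSubmit is accepted as a format field in your version.
    printf 'sacctmgr show associations user=%s format=User,Account,GrpSubmit\n' "$1"
}

unblock_submit_cmd() {
    # Assumption: a value of -1 clears a previously set limit.
    printf 'sacctmgr modify user %s set GrpSubmitJobs=-1\n' "$1"
}

block_submit_cmd foo
check_cmd foo
unblock_submit_cmd foo
```

Pipe the output to sh, or copy the lines by hand, once you are happy with them. This only helps if GrpSubmitJobs is not already set in a qos that overrides the association, which was the case here.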
But if anyone knows why my attempt at using a "drain" qos stopped foo's
previously submitted jobs from running, I'd be very interested to hear
about it.
Thanks again,
Mark
On Wed, 1 Apr 2020, David Rhey wrote:
> Hi Mark,
>
> I *think* you might need to update the user account to have access to that
> QoS (as part of their association). Using sacctmgr modify user <foo> + some
> additional args (they escape me at the moment).
>
> Also, you *might* have been able to set the MaxSubmitJobs at their account
> level to 0 and have them run without having to do the QoS approach - but
> that's just a guess on my end based on how we've done some things here. We
> had a "free period" for our clusters and once it was over we set the
> GrpSubmitJobs on an account to 0 which allowed in-flight jobs to continue
> but no new work to be submitted.
>
> HTH,
>
> David
>
> On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.dixon at durham.ac.uk> wrote:
>
>> Hi all,
>>
>> I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
>>
>> I'd like to stop user foo from submitting new jobs but allow their
>> existing jobs to run.
>>
>> We have several partitions, each with its own qos and MaxSubmitJobs
>> typically set to some value. These qos are stopping a "sacctmgr update user
>> foo set maxsubmitjobs=0" from doing anything useful, as per the
>> documentation.
>>
>> I've tried setting up a competing qos:
>>
>> sacctmgr add qos drain
>> sacctmgr modify qos drain set MaxSubmitJobs=0
>> sacctmgr modify qos drain set flags=OverPartQOS
>> sacctmgr modify user foo set qos=drain
>>
>> This has successfully prevented the user from submitting new jobs, but
>> their existing jobs aren't running. I'm seeing the reason code
>> "InvalidQOS".
>>
>> Any ideas what I should be looking at, please?
>>
>> Thanks,
>>
>> Mark
>>
>>
>
> --
> David Rhey
> ---------------
> Advanced Research Computing - Technology Services
> University of Michigan
>