[slurm-users] Drain a single user's jobs
mercan
ahmet.mercan at uhem.itu.edu.tr
Wed Apr 1 13:31:02 UTC 2020
Hi;
If you have a working job_submit.lua script, you can block new jobs
from a specific user:
if job_desc.user_name == "baduser" then
   -- Reject the submission by returning a Slurm error code.
   return 2045
end
That's all!
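
For context, here is a minimal sketch of how that check might sit inside a complete job_submit.lua. It assumes the job_submit/lua plugin is enabled and that your Slurm version exposes job_desc.user_name; on older releases you may need to compare job_desc.user_id against the numeric UID instead. BLOCKED_USERS and REJECT_CODE are hypothetical names introduced for illustration, and 2045 is simply the value from the snippet above.

```lua
-- Sketch only: BLOCKED_USERS and REJECT_CODE are illustrative names.
local BLOCKED_USERS = { baduser = true }
local REJECT_CODE = 2045  -- value taken from the snippet above

-- Called by slurmctld for each new job submission.
function slurm_job_submit(job_desc, part_list, submit_uid)
   local name = job_desc.user_name
   if name ~= nil and BLOCKED_USERS[name] then
      -- Record the rejection in the slurmctld log.
      slurm.log_info("job_submit.lua: rejecting job from blocked user %s", name)
      return REJECT_CODE
   end
   return slurm.SUCCESS
end

-- Called when an existing job is modified; the plugin expects this
-- function to be defined as well. Here it allows every modification.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

Because the check runs only at submission time, jobs that are already queued or running should not be affected. Removing the user from BLOCKED_USERS lifts the block.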
Regards;
Ahmet M.
On 1.04.2020 at 16:22, Mark Dixon wrote:
> Hi David,
>
> Thanks for this, it sounds like I've not been trying crazy methods -
> but they don't work for me:
>
> - "sacctmgr modify user foo set qos=drain" did set up the association
> ("sacctmgr show associations" showed that QoS changed from "normal" to
> "drain"), but this is when foo's jobs refused to start, with reason
> "InvalidQOS".
>
> - "sacctmgr update user foo set maxsubmitjobs=0" was ignored because qos
> were already set on the partitions.
>
> But... good news!
>
> We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify
> user foo set GrpSubmitJobs=0" isn't overridden anywhere, and the
> effect is exactly what I wanted - thanks!
>
> But if anyone knows why my attempt at using a "drain" qos stopped
> foo's previously submitted jobs from running, I'd be very interested
> to hear about it.
>
> Thanks again,
>
> Mark
>
> On Wed, 1 Apr 2020, David Rhey wrote:
>
>> Hi Mark,
>>
>> I *think* you might need to update the user account to have access to
>> that QoS (as part of their association), using sacctmgr modify user
>> <foo> plus some additional args (they escape me at the moment).
>>
>> Also, you *might* have been able to set the MaxSubmitJobs at their
>> account level to 0 and have them run without having to do the QoS
>> approach - but that's just a guess on my end based on how we've done
>> some things here. We had a "free period" for our clusters and once it
>> was over we set GrpSubmitJobs on an account to 0, which allowed
>> in-flight jobs to continue but no new work to be submitted.
>>
>> HTH,
>>
>> David
>>
>> On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.dixon at durham.ac.uk>
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
>>>
>>> I'd like to stop user foo from submitting new jobs but allow their
>>> existing jobs to run.
>>>
>>> We have several partitions, each with its own qos and MaxSubmitJobs
>>> typically set to some value. These qos are stopping a "sacctmgr
>>> update user foo set maxsubmitjobs=0" from doing anything useful, as
>>> per the documentation.
>>>
>>> I've tried setting up a competing qos:
>>>
>>> sacctmgr add qos drain
>>> sacctmgr modify qos drain set MaxSubmitJobs=0
>>> sacctmgr modify qos drain set flags=OverPartQOS
>>> sacctmgr modify user foo set qos=drain
>>>
>>> This has successfully prevented the user from submitting new jobs, but
>>> their existing jobs aren't running. I'm seeing the reason code
>>> "InvalidQOS".
>>>
>>> Any ideas what I should be looking at, please?
>>>
>>> Thanks,
>>>
>>> Mark
>>>
>>>
>>
>> --
>> David Rhey
>> ---------------
>> Advanced Research Computing - Technology Services
>> University of Michigan
>>
>