[slurm-users] Drain a single user's jobs

mercan ahmet.mercan at uhem.itu.edu.tr
Wed Apr 1 13:31:02 UTC 2020


Hi;

If you have a working job_submit.lua script, you can add a block that 
rejects new jobs from the specific user:

if job_desc.user_name == "baduser" then
    return 2045
end

That's all!
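
For readers without an existing script, here is a minimal sketch of where that check might sit in a complete job_submit.lua. It assumes the Lua job submit plugin is enabled (JobSubmitPlugins=lua in slurm.conf) and that job_desc.user_name is populated by your Slurm version; the blocked_users table and REJECT_CODE name are hypothetical, introduced only for illustration. The value 2045 is simply the one used in the snippet above.

```lua
-- job_submit.lua: illustrative sketch only.
-- slurmctld supplies the global `slurm` table when it loads this file.

-- Hypothetical list of users whose new submissions are refused.
local blocked_users = { baduser = true }

-- Code returned to the submitter; 2045 is the value from the snippet above.
local REJECT_CODE = 2045

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- Assumption: job_desc.user_name is available in your Slurm version.
    local name = job_desc.user_name
    if name ~= nil and blocked_users[name] then
        slurm.log_info("job_submit: rejecting new job from %s", name)
        return REJECT_CODE
    end
    -- Everyone else is accepted unchanged.
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    -- Modifications to existing jobs are left untouched.
    return slurm.SUCCESS
end
```

Because the check runs only at submission time, jobs that are already queued or running should not be affected. The example script shipped with Slurm also ends with a top-level "return slurm.SUCCESS"; keep that line if your copy follows the same layout.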

Regards;

Ahmet M.


On 1.04.2020 at 16:22, Mark Dixon wrote:
> Hi David,
>
> Thanks for this, it sounds like I've not been trying crazy methods - 
> but they don't work for me:
>
> - "sacctmgr modify user foo set qos=drain" did set up the association
>   ("sacctmgr show associations" showed that QoS changed from "normal" to
>   "drain"), but this is when foo's jobs refused to start because of
>   reason "InvalidQOS".
>
> - "sacctmgr update user foo set maxsubmitjobs=0" was ignored because qos
>   were already set on the partitions.
>
> But... good news!
>
> We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify 
> user foo set GrpSubmitJobs=0" isn't overridden anywhere, and the 
> effect is exactly what I wanted - thanks!
>
> But if anyone knows why my attempt at using a "drain" qos stopped 
> foo's previously submitted jobs from running, I'd be very interested 
> to hear about it.
>
> Thanks again,
>
> Mark
>
> On Wed, 1 Apr 2020, David Rhey wrote:
>
>> Hi Mark,
>>
>> I *think* you might need to update the user account to have access to that
>> QoS (as part of their association). Using sacctmgr modify user <foo> + some
>> additional args (they escape me at the moment).
>>
>> Also, you *might* have been able to set the MaxSubmitJobs at their account
>> level to 0 and have them run without having to do the QoS approach - but
>> that's just a guess on my end based on how we've done some things here. We
>> had a "free period" for our clusters and once it was over we set
>> GrpSubmitJobs on an account to 0, which allowed in-flight jobs to continue
>> but no new work to be submitted.
>>
>> HTH,
>>
>> David
>>
>> On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.dixon at durham.ac.uk> wrote:
>>
>>> Hi all,
>>>
>>> I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
>>>
>>> I'd like to stop user foo from submitting new jobs but allow their
>>> existing jobs to run.
>>>
>>> We have several partitions, each with its own qos and MaxSubmitJobs
>>> typically set to some value. These qos are stopping a "sacctmgr update user
>>> foo set maxsubmitjobs=0" from doing anything useful, as per the
>>> documentation.
>>>
>>> I've tried setting up a competing qos:
>>>
>>>    sacctmgr add qos drain
>>>    sacctmgr modify qos drain set MaxSubmitJobs=0
>>>    sacctmgr modify qos drain set flags=OverPartQOS
>>>    sacctmgr modify user foo set qos=drain
>>>
>>> This has successfully prevented the user from submitting new jobs, but
>>> their existing jobs aren't running. I'm seeing the reason code
>>> "InvalidQOS".
>>>
>>> Any ideas what I should be looking at, please?
>>>
>>> Thanks,
>>>
>>> Mark
>>>
>>>
>>
>> -- 
>> David Rhey
>> ---------------
>> Advanced Research Computing - Technology Services
>> University of Michigan
>>
>


