[slurm-users] Drain a single user's jobs

Mark Dixon mark.c.dixon at durham.ac.uk
Wed Apr 1 14:27:21 UTC 2020


Hi Ahmet,

Another way to do it! Many thanks - very useful :)
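
The sacctmgr route that ended up working (GrpSubmitJobs=0 on the user
association, described in the quoted message below) can be sketched as a
small shell wrapper. This is only a sketch: run, DRY_RUN, TARGET_USER,
block_submissions and allow_submissions are hypothetical names, not part of
Slurm, and it assumes no qos elsewhere overrides GrpSubmitJobs, as was the
case on this cluster. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: stop a user submitting new jobs while already-submitted jobs continue.
# DRY_RUN, TARGET_USER and the functions below are hypothetical helpers.
DRY_RUN=${DRY_RUN:-1}
TARGET_USER=${TARGET_USER:-foo}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set GrpSubmitJobs=0 on the user's association(s).
# -i commits the change without the interactive confirmation prompt.
block_submissions() {
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=0
}

# Setting the limit to -1 clears it again.
allow_submissions() {
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=-1
}

block_submissions "$TARGET_USER"
```

Run it with DRY_RUN=0 to apply the change for real; allow_submissions is
there to undo it once the user should be able to submit again.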

But does anyone know why a user association with my qos stopped jobs from
running, with reason InvalidQOS?

I can imagine using a user qos to override a partition qos being useful 
for other things, so it would be nice to know what I've done wrong.
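
One way to look into this is to compare the qos list on foo's association
with the qos each pending job is asking for. The sketch below assumes,
without confirmation, that the two no longer match after "set qos=drain".
The names run, DRY_RUN, TARGET_USER, show_user_qos and show_job_qos are
hypothetical helpers, not part of Slurm, and by default the script only
prints the commands.

```shell
#!/bin/sh
# Sketch: show the QOS a user's association allows next to the QOS their
# jobs request. DRY_RUN, TARGET_USER and the functions are hypothetical.
DRY_RUN=${DRY_RUN:-1}
TARGET_USER=${TARGET_USER:-foo}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# QOS list and default QOS attached to the user's association(s).
show_user_qos() {
    run sacctmgr show associations user="$1" format=user,account,partition,qos,defaultqos
}

# Job ID, requested QOS and pending reason for each of the user's jobs.
show_job_qos() {
    run squeue -u "$1" -O jobid,qos,reason
}

show_user_qos "$TARGET_USER"
show_job_qos "$TARGET_USER"
```

If the jobs list a qos that the association no longer includes, that would
at least be consistent with the InvalidQOS reason, though it is a guess
rather than a confirmed diagnosis.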

Best,

Mark

On Wed, 1 Apr 2020, mercan wrote:

> Hi;
>
> If you have a working job_submit.lua script, you can block new jobs from a
> specific user:
>
> if job_desc.user_name == "baduser" then
>     return 2045
> end
>
> That's all!
>
> Regards;
>
> Ahmet M.
>
>
> On 1.04.2020 at 16:22, Mark Dixon wrote:
>>  Hi David,
>>
>>  Thanks for this, it sounds like I've not been trying crazy methods - but
>>  they don't work for me:
>>
>>  - "sacctmgr modify user foo set qos=drain" did set up the association
>>    ("sacctmgr show associations" showed that the QoS changed from "normal"
>>    to "drain"), but at that point foo's jobs refused to start, with reason
>>    "InvalidQOS".
>>
>>  - "sacctmgr update user foo set maxsubmitjobs=0" was ignored because qos
>>    were already set on the partitions.
>>
>>  But... good news!
>>
>>  We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify user
>>  foo set GrpSubmitJobs=0" isn't overridden anywhere, and the effect is
>>  exactly what I wanted - thanks!
>>
>>  But if anyone knows why my attempt at using a "drain" qos stopped foo's
>>  previously submitted jobs from running, I'd be very interested to hear
>>  about it.
>>
>>  Thanks again,
>>
>>  Mark
>>
>>  On Wed, 1 Apr 2020, David Rhey wrote:
>>
>>>  Hi Mark,
>>>
>>>  I *think* you might need to update the user account to have access to that
>>>  QoS (as part of their association), using sacctmgr modify user <foo> plus
>>>  some additional args (they escape me at the moment).
>>>
>>>  Also, you *might* have been able to set MaxSubmitJobs at their account
>>>  level to 0 and have them run without having to do the QoS approach - but
>>>  that's just a guess on my end based on how we've done some things here. We
>>>  had a "free period" for our clusters, and once it was over we set
>>>  GrpSubmitJobs on an account to 0, which allowed in-flight jobs to continue
>>>  but no new work to be submitted.
>>>
>>>  HTH,
>>>
>>>  David
>>>
>>>  On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.dixon at durham.ac.uk>
>>>  wrote:
>>>
>>>>  Hi all,
>>>>
>>>>  I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
>>>>
>>>>  I'd like to stop user foo from submitting new jobs but allow their
>>>>  existing jobs to run.
>>>>
>>>>  We have several partitions, each with its own qos and MaxSubmitJobs
>>>>  typically set to some value. These qos are stopping a "sacctmgr update
>>>>  user foo set maxsubmitjobs=0" from doing anything useful, as per the
>>>>  documentation.
>>>>
>>>>  I've tried setting up a competing qos:
>>>>
>>>>     sacctmgr add qos drain
>>>>     sacctmgr modify qos drain set MaxSubmitJobs=0
>>>>     sacctmgr modify qos drain set flags=OverPartQOS
>>>>     sacctmgr modify user foo set qos=drain
>>>>
>>>>  This has successfully prevented the user from submitting new jobs, but
>>>>  their existing jobs aren't running. I'm seeing the reason code
>>>>  "InvalidQOS".
>>>>
>>>>  Any ideas what I should be looking at, please?
>>>>
>>>>  Thanks,
>>>>
>>>>  Mark
>>>> 
>>>> 
>>>
>>>  --
>>>  David Rhey
>>>  ---------------
>>>  Advanced Research Computing - Technology Services
>>>  University of Michigan
>>> 
>> 
>

-- 
Mark Dixon <mark.c.dixon at durham.ac.uk> Tel: +44(0)191 33 41383
Advanced Research Computing (ARC), Durham University, UK

