[slurm-users] How to deal with user running stuff in frontend node?

John Hanks griznog at gmail.com
Thu Feb 15 09:03:26 MST 2018


I've used this with some success:
https://github.com/JohannesBuchner/verynice. For CPU-intensive things it
works great, but you also have to set some memory limits in limits.conf if
users do any large-memory stuff. Otherwise I just use a problem process as
a chance to start a conversation with that user to see what they are
working on. It seems to make people happy when you talk to them and try to
help rather than just killing their work and scolding them.
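
As a rough illustration of the kind of per-process limits involved (the
same kernel resource limits that ulimit and limits.conf set), here is a
minimal Python sketch that caps one command's CPU time and, optionally,
its address space. The helper name run_with_limits and the values used
are made up for the example; this is not what verynice does and not a
drop-in policy for a login node. Note that the CPU limit counts CPU
seconds, not wall-clock time, so a mostly idle monitoring script can
keep running for a long time under it.

```python
import resource
import signal
import subprocess
import sys


def run_with_limits(argv, cpu_seconds, mem_bytes=None):
    """Run argv with a CPU-time cap and, optionally, an address-space cap.

    Hypothetical helper for illustration. Returns the exit status; a
    negative value -N means the child was terminated by signal N.
    """
    def apply_limits():
        # Executed in the child between fork and exec, so the parent
        # is unaffected and the command inherits the limits.
        # At the soft limit the kernel sends SIGXCPU; the hard limit,
        # set one second later here, results in SIGKILL.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))
        if mem_bytes is not None:
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(argv, preexec_fn=apply_limits).returncode


if __name__ == "__main__":
    # A busy loop capped at 2 CPU-seconds.
    rc = run_with_limits([sys.executable, "-c", "while True: pass"], cpu_seconds=2)
    if rc < 0:
        print("terminated by", signal.Signals(-rc).name)
```

In limits.conf the corresponding items are "cpu" (in minutes) and "as"
(in KB), applied per user or group at login rather than per command.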

jbh

On Thu, Feb 15, 2018 at 7:32 AM, Pablo Escobar <pescobar001 at gmail.com>
wrote:

> Hi Manuel,
>
> A possible workaround is to configure a per-user cgroups limit on the
> frontend node so a single user cannot allocate more than 1GB of RAM (or
> whatever value you prefer). The user would still be able to abuse the
> machine, but as soon as his memory usage goes above the limit his job will
> be killed by the cgroup, and this should not affect the well-behaved users
> too much.
>
> In any case, the best solution I know is a non-technical one. When a user
> abuses the system we close the account. He quickly sends an email asking
> what happened and why he cannot log in, and we reply that as he abused the
> system we won't open the account until his boss contacts us asking to
> reopen it. After the user has to explain the "problem" to his/her boss they
> don't abuse the system again ;)
>
> regards,
> Pablo.
>
> 2018-02-15 16:11 GMT+01:00 Manuel Rodríguez Pascual <
> manuel.rodriguez.pascual at gmail.com>:
>
>> Hi all,
>>
>> Although this is not strictly related to Slurm, maybe you can recommend
>> me some actions to deal with a particular user.
>>
>> On our small cluster, currently there are no limits to run applications
>> in the frontend. This is sometimes really useful for some users, for
>> example to have scripts monitoring the execution of jobs and taking
>> decisions depending on the partial results.
>>
>> However, we have this user that keeps abusing this system: when the job
>> queue is long and there is a significant wait time, he sometimes runs his
>> jobs on the frontend, resulting in a CPU load of 100% and some delays in
>> using it for the things it is supposed to serve (user login, monitoring and
>> so on).
>>
>> Have you faced the same issue?  Is there any solution? I am thinking
>> about using ulimit to limit the execution time of these jobs on the frontend
>> to 5 minutes or so. This, however, does not look very elegant, as other users
>> can commit the same abuse in the future, and he should also be able to run
>> low-CPU-consuming jobs for a longer period. However, I am not an experienced
>> sysadmin so I am completely open to suggestions or different ways of facing
>> this issue.
>>
>> Any thoughts?
>>
>> cheers,
>>
>>
>>
>>
>> Manuel
>>
>
>