[slurm-users] [External] Re: What is an easy way to prevent users run programs on the master/login node.
Prentice Bisbal
pbisbal at pppl.gov
Tue Apr 27 15:45:49 UTC 2021
This is not a good approach. There are plenty of jobs that will hog a
system's resources without using MPI. MATLAB and Mathematica both
support parallel computation and don't need MPI to do so. Then there
are OpenMP and other threaded applications that don't need
mpirun/mpiexec to launch them.
Limiting the number of processes or threads is not the only concern,
either. A user can easily run a single-threaded task that hogs all the
RAM, or use bbcp to transfer a large amount of data, choking the
network interface.
Using cgroups is really the only reliable way to limit users, and
Arbiter seems like the best way to automatically manage cgroup-imposed
limits.
I haven't used Arbiter yet, but I've seen presentations on it, and I'm
preparing to deploy it myself.
https://dylngg.github.io/resources/arbiterTechPaper.pdf
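As a rough illustration of what cgroup-imposed limits look like without
Arbiter, here is a minimal sketch using plain systemd resource controls
(assuming a reasonably recent systemd with the cgroup v2 unified
hierarchy; the UID and the limit values are placeholders, not
recommendations):

# Cap one user's slice on the login node. CPUQuota=200% means at most
# two cores' worth of CPU time; MemoryMax caps the slice's total memory.
systemctl set-property user-1234.slice CPUQuota=200% MemoryMax=16G TasksMax=256

# Or apply the same caps to every user slice via a template drop-in:
mkdir -p /etc/systemd/system/user-.slice.d
cat > /etc/systemd/system/user-.slice.d/50-login-limits.conf <<'EOF'
[Slice]
CPUQuota=200%
MemoryMax=16G
TasksMax=256
EOF
systemctl daemon-reload

Arbiter goes beyond static caps like these by watching actual usage and
adjusting the limits per user automatically, which is the appeal of
deploying it instead of hand-tuning values.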
Prentice
On 4/25/21 3:46 AM, Patrick Begou wrote:
> Hi,
>
> I also saw a cluster setup where mpirun or mpiexec commands were
> replaced by a shell script just saying "please use srun or sbatch...".
>
> Patrick
>
> Le 24/04/2021 à 10:03, Ole Holm Nielsen a écrit :
>> On 24-04-2021 04:37, Cristóbal Navarro wrote:
>>> Hi Community,
>>> I have a set of users still not so familiar with slurm, and yesterday
>>> they bypassed srun/sbatch and just ran their CPU program directly on
>>> the head/login node thinking it would still run on the compute node.
>>> I am aware that I will need to teach them some basic usage, but in
>>> the meanwhile, how have you solved this type of user-behavior
>>> problem? Is there a preferred way to restrict the master/login
>>> resources, or actions, to regular users?
>> We restrict user limits in /etc/security/limits.conf so users can't
>> run very long or very big tasks on the login nodes:
>>
>> # Normal user limits (cpu is in minutes; rss/data/stack are in KB)
>> * hard cpu 20
>> * hard rss 50000000
>> * hard data 50000000
>> * soft stack 40000000
>> * hard stack 50000000
>> * hard nproc 250
>>
>> /Ole
>>
>
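The wrapper Patrick mentions can be as simple as the sketch below; the
install path and the exact wording are site choices, not anything Slurm
prescribes, and it assumes the real mpirun/mpiexec either live only on
the compute nodes or are shadowed by putting this script earlier in the
login node's PATH:

#!/bin/sh
# Installed as /usr/local/bin/mpirun (and mpiexec) on the login node only.
# Refuses to launch anything and points the user at the scheduler instead.
echo "mpirun/mpiexec are not available on the login node." >&2
echo "Please submit your job with sbatch, or launch it with srun." >&2
exit 1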