[slurm-users] [External] Re: What is an easy way to prevent users run programs on the master/login node.
wagner at itc.rwth-aachen.de
Thu May 20 05:13:37 UTC 2021
You are right. I looked into the wrapper script (it is not my area, and I had never worked on it before).
In fact, the MPI processes are spawned on the backend nodes; the only process remaining on the login/frontend node is the spawner process.
The wrapper checks the load on the nodes and then creates a corresponding hostfile:
Host nrm214: current load 0.53 => 96 slots left
Host nrm215: current load 0.14 => 96 slots left
Host nrm212: current load 0.09 => 96 slots left
Host nrm213: current load 0.13 => 96 slots left
nrm214 0 (current load is: 0.53)
nrm215 0 (current load is: 0.14)
nrm212 2.0 (current load is: 0.09)
nrm213 0 (current load is: 0.13)
Writing to /tmp/mw445520/login_60004/hostfile-613910
And then spawns the job:
Command: /opt/intel/impi/2018.4.274/compilers_and_libraries/linux/mpi/bin64/mpirun -launcher ssh -machinefile /tmp/mw445520/login_60004/hostfile-63375 -np 2 <code>
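The hostfile step above could look roughly like the following Python sketch. It is not the actual wrapper, whose source is not shown in this thread. The slot formula (cores minus the integer part of the load), the 96-core default, the least-loaded-first placement, and the `host:slots` line format are all assumptions, chosen only because they are consistent with the output shown above.

```python
import math


def slots_left(load, cores=96):
    """Free slots on a host. Assumed formula: cores minus floor(load)."""
    return max(0, cores - math.floor(load))


def assign_processes(loads, nprocs, cores=96):
    """Place nprocs processes on the least-loaded hosts first (assumed policy)."""
    assignment = {host: 0 for host in loads}
    remaining = nprocs
    # Visit hosts in order of increasing load.
    for host in sorted(loads, key=loads.get):
        take = min(remaining, slots_left(loads[host], cores))
        assignment[host] = take
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise RuntimeError("not enough free slots on the backend nodes")
    return assignment


def hostfile_lines(assignment):
    """Render only hosts that received processes, as 'host:slots' (assumed format)."""
    return [f"{host}:{n}" for host, n in assignment.items() if n > 0]


if __name__ == "__main__":
    # Load values taken from the wrapper output quoted above.
    loads = {"nrm214": 0.53, "nrm215": 0.14, "nrm212": 0.09, "nrm213": 0.13}
    for line in hostfile_lines(assign_processes(loads, nprocs=2)):
        print(line)
```

With the loads from the output above and `-np 2`, this places both processes on nrm212, the least-loaded host, which matches the wrapper's listing.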
I hope this clears things up a little.
On 27.04.2021 at 17:48, Prentice Bisbal wrote:
> But won't that first process be able to use 100% of a core? What if enough users do this such that every core is at 100% utilization? Or, what if the application is MPI + OpenMP? In that case, that one process on the login node could spawn multiple threads that use the remaining cores on the login node.
> On 4/26/21 2:01 AM, Marcus Wagner wrote:
>> we also have a wrapper script, together with a number of "MPI-Backends".
>> If mpiexec is called on the login nodes, only the first process is started on the login node, the rest runs on the MPI backends.
>> On 25.04.2021 at 09:46, Patrick Begou wrote:
>>> I also saw a cluster setup where mpirun or mpiexec commands were
>>> replaced by a shell script just saying "please use srun or sbatch...".
>>> On 24/04/2021 at 10:03, Ole Holm Nielsen wrote:
>>>> On 24-04-2021 04:37, Cristóbal Navarro wrote:
>>>>> Hi Community,
>>>>> I have a set of users still not so familiar with slurm, and yesterday
>>>>> they bypassed srun/sbatch and just ran their CPU program directly on
>>>>> the head/login node thinking it would still run on the compute node.
>>>>> I am aware that I will need to teach them some basic usage, but in
>>>>> the meantime, how have you solved this type of user-behavior
>>>>> problem? Is there a preferred way to restrict the master/login
>>>>> resources, or actions, for regular users?
>>>> We restrict user limits in /etc/security/limits.conf so users can't
>>>> run very long or very big tasks on the login nodes:
>>>> # Normal user limits
>>>> * hard cpu 20
>>>> * hard rss 50000000
>>>> * hard data 50000000
>>>> * soft stack 40000000
>>>> * hard stack 50000000
>>>> * hard nproc 250
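To make the units in the quoted limits easier to read, here is a small Python sketch that parses such lines and attaches units. The unit table follows the limits.conf(5) man page as I understand it (cpu in minutes; rss, data, and stack in KB; nproc as a process count). The function names are made up for illustration. Note also that the man page describes rss as ignored on Linux 2.4.30 and later, so that line may have no effect on current systems.

```python
# Units per item, as documented in limits.conf(5) (to the best of my knowledge).
UNITS = {
    "cpu": "minutes",
    "rss": "KB",
    "data": "KB",
    "stack": "KB",
    "nproc": "processes",
}


def parse_limits(text):
    """Parse '<domain> <type> <item> <value>' lines, skipping comments and blanks."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) != 4:
            continue
        domain, kind, item, value = fields
        # Values such as "unlimited" are kept as strings.
        parsed = int(value) if value.lstrip("-").isdigit() else value
        entries.append({
            "domain": domain,
            "type": kind,
            "item": item,
            "value": parsed,
            "unit": UNITS.get(item, ""),
        })
    return entries


def kb_to_gib(kb):
    """Convert a KB value (1 KB = 1024 bytes) to GiB."""
    return kb / (1024 * 1024)


if __name__ == "__main__":
    sample = """
    # Normal user limits
    * hard cpu 20
    * hard rss 50000000
    * hard data 50000000
    * soft stack 40000000
    * hard stack 50000000
    * hard nproc 250
    """
    for e in parse_limits(sample):
        print(e["type"], e["item"], e["value"], e["unit"])
```

Read this way, the quoted settings cap each process at 20 minutes of CPU time and 250 processes per user, and the 50000000 KB limits come to a little under 48 GiB.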
Dipl.-Inf. Marcus Wagner
Gruppe: Systemgruppe Linux
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de