[slurm-users] [External] Re: PropagateResourceLimits

Prentice Bisbal pbisbal at pppl.gov
Thu Apr 29 17:41:01 UTC 2021


So I decided to eat my own dog food, and tested this out myself. First 
of all, running ulimit through srun "naked" like that doesn't work, 
since ulimit is a bash shell builtin, so I had to write a simple shell 
script:

$ cat ulimit.sh

#!/bin/bash

ulimit -a
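
(For the record, a wrapper script may not be strictly necessary; something
like

$ srun -N1 -n1 bash -c 'ulimit -a'

should also work, since it hands the builtin to a shell on the compute
node.)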

By default, the core file size limit is set to zero in our environment as 
a good security practice and to keep users' core dumps from filling up 
the filesystem. My default ulimit settings:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 128054
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Now I run my ulimit.sh script through srun:

$ srun -N1 -n1 -t 00:01:00 --mem=1G ./ulimit.sh
srun: job 1249977 queued and waiting for resources
srun: job 1249977 has been allocated resources
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257092
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) 1048576
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Aside from max memory size (capped at 1 GB by --mem=1G) and pending 
signals (a node-specific value), the propagated limits match my local 
shell. Now I set the core file size:

$ ulimit -c 1024
$ ulimit -c
1024

And run ulimit.sh through srun again:

$ srun -N1 -n1 -t 00:01:00 --mem=1G ./ulimit.sh
srun: job 1249978 queued and waiting for resources
srun: job 1249978 has been allocated resources
core file size          (blocks, -c) 1024
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257092
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) 1048576
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

This confirms that the limits propagated by PropagateResourceLimits come 
from the user's environment, not from PAM. If you have UsePAM enabled, as 
Ryan suggested in a previous e-mail, that puts *upper limits* on the 
values propagated by PropagateResourceLimits. According to the slurm.conf 
man page, it doesn't necessarily override the limits set in the 
environment when the job is submitted:

>  UsePAM If set to 1, PAM (Pluggable Authentication Modules for Linux)
>         will be enabled. PAM is used to establish the upper bounds for
>         resource limits. With PAM support enabled, local system
>         administrators can dynamically configure system resource limits.
>         Changing the upper bound of a resource limit will not alter the
>         limits of running jobs, only jobs started after a change has
>         been made will pick up the new limits. The default value is 0
>         (not to enable PAM support)....

So if I set my core file size to 0 and /etc/security/limits.conf on the 
node sets it to 1024, then with UsePAM=1 and PropagateResourceLimits=ALL 
(the default for that setting), the core file size will stay 0. If I 
instead set it to 2048 with UsePAM=1, Slurm will reduce that limit to 1024.
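
To make that scenario concrete, the limit on the compute nodes would come 
from a limits.conf entry along these lines (illustrative; the domain and 
the hard/soft choice are site-specific):

# /etc/security/limits.conf
*    hard    core    1024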

Note that setting UsePAM=1 alone isn't enough. You also need to set up a 
PAM configuration for a service named slurm (i.e. /etc/pam.d/slurm), as 
Ryan pointed out.
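
As a minimal sketch, assuming you want pam_limits to apply the limits 
from /etc/security/limits.conf, that file might look like this (the exact 
stack is site-specific; the sample shipped with the Slurm source is the 
place to start):

# /etc/pam.d/slurm (illustrative)
account  required  pam_unix.so
session  required  pam_limits.so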

Prentice

On 4/29/21 12:35 PM, Prentice Bisbal wrote:
>
> On 4/28/21 2:26 AM, Diego Zuccato wrote:
>
>> Il 27/04/2021 17:31, Prentice Bisbal ha scritto:
>>
>>> I don't think PAM comes into play here. Since Slurm is starting the 
>>> processes on the compute nodes as the user, etc., PAM is being 
>>> bypassed.
>> Then maybe slurmd somehow goes through the PAM stack another way, 
>> since limits on the frontend got propagated (as implied by 
>> PropagateResourceLimits' default value of ALL).
>> And I can confirm that setting it to NONE seems to have solved the 
>> issue: users on the frontend get limited resources, and jobs on the 
>> nodes get the resources they asked for.
>>
> In this case, Slurm is deliberately looking at the resource limits in 
> effect when the job is submitted on the submission host, and then 
> copying them to the job's environment. From the slurm.conf 
> documentation (https://slurm.schedmd.com/slurm.conf.html):
>
>> *PropagateResourceLimits*
>>     A comma-separated list of resource limit names. The slurmd daemon
>>     uses these names to obtain the associated (soft) limit values
>>     from the user's process environment on the submit node. These
>>     limits are then propagated and applied to the jobs that will run
>>     on the compute nodes.
>>
> Then later on, it indicates that all resource limits are propagated by 
> default:
>
>> The following limit names are supported by Slurm (although some 
>> options may not be supported on some systems):
>>
>> *ALL*
>>     All limits listed below (default)
>>
> You should be able to verify this yourself in the following manner:
>
> 1. Start two separate shells on the submission host
>
> 2. Change the limits in one of the shells. For example, reduce core 
> size to 0, with 'ulimit -c 0' in just one shell.
>
> 3. Then run 'srun ulimit -a' from each shell.
>
> 4. Compare the output. The shell where you changed the limit should 
> show that core size is now zero.
>
> --
>
> Prentice
>