[slurm-users] RLIMIT_NPROCS

Wagner, Marcus wagner at itc.rwth-aachen.de
Thu Mar 23 11:58:23 UTC 2023


Hi Hermann,

No, we don't use --propagate.

In slurm.conf, we set

PropagateResourceLimits=CORE

That in fact means that we do not propagate any limits besides the 
core size (excerpt from the slurm.conf man page):

> If neither PropagateResourceLimits or PropagateResourceLimitsExcept 
> are configured and the "--propagate" option is not specified, then 
> the default action is to propagate all limits.

So the maximum number of processes should not be propagated from the 
submit nodes to the batch nodes. Moreover, I do not know where the high 
limit of 767202 in the error message might come from.

In /etc/security/limits.conf we set
*               soft    nproc 262144

ulimit -u gives me 16384 on the submit nodes.
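
Note that a plain "ulimit -u" in bash reports only the soft limit; the 
hard limit ("ulimit -Hu") can differ. A minimal Python sketch for 
printing both on a given node, so that submit and batch nodes can be 
compared (the helper names fmt and nproc_limits are made up for this 
illustration):

```python
import resource


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC pair of the current process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print(f"RLIMIT_NPROC soft={fmt(soft)} hard={fmt(hard)}")
```

Running it once in a login shell on a submit node and once inside a 
job step should show whether the two sides disagree.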

The batch jobs are still working as expected, but that "error" message 
is somewhat disturbing.
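
For reference, setrlimit(2) fails with EINVAL ("Invalid argument") 
when the requested soft limit exceeds the requested hard limit, and a 
failed call leaves the existing limits untouched, which would be 
consistent with the jobs still running normally. A minimal sketch of 
that behaviour in Python (this is not Slurm's code; the helper name 
set_nproc and the values 2048 and 1024 are arbitrary choices for the 
illustration):

```python
import resource


def set_nproc(soft, hard):
    """Attempt setrlimit(RLIMIT_NPROC); return the error text, or None on success."""
    try:
        resource.setrlimit(resource.RLIMIT_NPROC, (soft, hard))
    except (ValueError, OSError) as exc:
        return str(exc)
    return None


before = resource.getrlimit(resource.RLIMIT_NPROC)

# Soft limit above hard limit: the kernel rejects this with EINVAL,
# which Python's resource module surfaces as a ValueError.
error = set_nproc(2048, 1024)

after = resource.getrlimit(resource.RLIMIT_NPROC)

print("setrlimit failed:", error)
print("limits unchanged:", before == after)
```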

Best
Marcus

On 23.03.2023 at 10:01, Hermann Schwärzler wrote:
> Hi Marcus,
>
> I am not sure if this is helpful, but from looking at the Slurm source 
> code (line 276 of src/slurmd/slurmstepd/ulimits.c in version 22.05) it 
> looks like you are explicitly using
> "--propagate..."
> to set the resource limits on the workers (the ones you see when 
> running "ulimit -a") to the same values as on the submit host.
>
> The error "Invalid argument" is returned when Slurm wants to set the 
> hard limit lower than the (default?) soft limit (in this particular 
> case for the maximum number of processes
> ("ulimit -u")).
> Maybe your hard limit for that on the submit host is configured to be 
> lower than it is on the worker nodes, so Slurm gets this error and 
> shows it to you because you were using the --propagate option?
>
> Regards,
> Hermann
>
>
> On 3/23/23 08:00, Wagner, Marcus wrote:
>> Hi Folks,
>>
>> Has anyone ever stumbled upon an error like this:
>>
>> slurmstepd: error: Can't propagate RLIMIT_NPROC of 767202 from submit 
>> host: Invalid argument
>>
>>
>> Does anyone know where that comes from?
>> Any hints are welcome.
>>
>>
>> Best
>> Marcus
>>
>