[slurm-users] RLIMIT_NPROCS
Wagner, Marcus
wagner at itc.rwth-aachen.de
Thu Mar 23 11:58:23 UTC 2023
Hi Hermann,
No, we don't use --propagate.
In slurm.conf we set
PropagateResourceLimits=CORE
This means that the core size is the only limit we propagate. For
reference, the default behaviour described in the slurm.conf manpage is:
> If neither PropagateResourceLimits or PropagateResourceLimitsExcept
> are configured and the "--propagate" option is not specified, then
> the default action is to propagate all limits.
So the maximum number of processes should not be propagated from the
submit nodes to the batch nodes at all. Moreover, I do not know where
that high limit comes from.
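To spell out that reading of the manpage, here is a minimal Python sketch
of which limits would be propagated for a given configuration. It only
models the documented behaviour and is not Slurm's actual code; the
function name is_propagated is made up for illustration, and the sketch
ignores a --propagate option given at submit time.

```python
def is_propagated(limit, propagate=None, propagate_except=None):
    """Model of the slurm.conf manpage wording, not of Slurm's source code.

    limit: limit name without the RLIMIT_ prefix, e.g. "NPROC"
    propagate: value of PropagateResourceLimits, or None if unset
    propagate_except: value of PropagateResourceLimitsExcept, or None if unset
    """
    limit = limit.upper()
    if propagate is not None:
        names = {name.strip().upper() for name in propagate.split(",")}
        if "ALL" in names:
            return True
        if "NONE" in names:
            return False
        return limit in names
    if propagate_except is not None:
        excluded = {name.strip().upper() for name in propagate_except.split(",")}
        return limit not in excluded
    # Neither parameter is configured: the default is to propagate all limits.
    return True


print(is_propagated("NPROC", propagate="CORE"))  # False
print(is_propagated("CORE", propagate="CORE"))   # True
print(is_propagated("NPROC"))                    # True
```

With PropagateResourceLimits=CORE this gives False for NPROC, which is
why the message about RLIMIT_NPROC is unexpected.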
In /etc/security/limits.conf we set
* soft nproc 262144
ulimit -u gives me 16384 on the submit nodes, so neither value matches
the 767202 from the error message.
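To compare the limits a process really gets on a submit node and on a
batch node, something like the following Python sketch could be run on
both (for example interactively and inside a batch job). It uses only the
standard resource module; the helper names are just for illustration.
Note that ulimit -u without -H shows the soft limit only, while this
prints both.

```python
import resource


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC pair of the current process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


soft, hard = nproc_limits()
print(f"RLIMIT_NPROC soft={fmt(soft)} hard={fmt(hard)}")
```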
The batch jobs are still working as expected, but that "error" message
is somewhat disturbing.
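As far as the "Invalid argument" part goes: that is the EINVAL which
setrlimit(2) returns when the requested soft limit is larger than the
requested hard limit, which fits Hermann's explanation quoted below. A
small Python sketch that provokes the same error follows. The hard limit
of 16384 is only an assumed value for the example, since the actual hard
limit on the batch nodes is not stated in this thread, and try_set_nproc
is a made-up helper name. Python reports this EINVAL as a ValueError.

```python
import resource


def try_set_nproc(soft, hard):
    """Try to set RLIMIT_NPROC; return None on success, else the error text."""
    try:
        resource.setrlimit(resource.RLIMIT_NPROC, (soft, hard))
    except (ValueError, OSError) as exc:
        return str(exc)
    return None


# 767202 is the value from the error message; 16384 is an assumed hard limit.
# A soft limit above the hard limit is rejected, so no limit is changed here.
err = try_set_nproc(767202, 16384)
print("setrlimit failed:", err)
```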
Best
Marcus
On 23.03.2023 at 10:01, Hermann Schwärzler wrote:
> Hi Marcus,
>
> I am not sure if this is helpful, but from looking at the source code
> of Slurm (line 276 of src/slurmd/slurmstepd/ulimits.c in version
> 22.05) it looks like you are explicitly using
> "--propagate..."
> to set the resource limits (the ones you see when running
> "ulimit -a") on the workers to the same values as on the submit host.
>
> The error "Invalid argument" is returned when Slurm wants to set the
> hard limit lower than the (default?) soft limit (in this particular
> case for the maximum number of processes
> ("ulimit -u")).
> Maybe your hard limit for that on the submit host is configured to be
> lower than it is on the worker nodes; Slurm gets this error and shows
> it to you as you were using the --propagate option?
>
> Regards,
> Hermann
>
>
> On 3/23/23 08:00, Wagner, Marcus wrote:
>> Hi Folks,
>>
>> has anyone ever stumbled upon such an error:
>>
>> slurmstepd: error: Can't propagate RLIMIT_NPROC of 767202 from submit
>> host: Invalid argument
>>
>>
>> Does anyone know where that comes from?
>> Any hints are welcome.
>>
>>
>> Best
>> Marcus
>>
>