[slurm-users] MaxRSS not showing up in sacct

Brian Andrus toomuchit at gmail.com
Mon Sep 16 22:14:59 UTC 2019


One other thing I noticed is that the *_job_table has entries in
tres_alloc and tres_req whose ids match the ids in the tres_table,
but there are no mem entries.
For example, tres_table:
+---------------+---------+------+----------------+------+
| creation_time | deleted | id   | type           | name |
+---------------+---------+------+----------------+------+
|    1559250721 |       0 |    1 | cpu            |      |
|    1559250721 |       0 |    2 | mem            |      |
|    1559250721 |       0 |    3 | energy         |      |
|    1559250721 |       0 |    4 | node           |      |
|    1559250721 |       0 |    5 | billing        |      |
|    1559250721 |       0 |    6 | fs             | disk |
|    1559250721 |       0 |    7 | vmem           |      |
|    1559250721 |       0 |    8 | pages          |      |
|    1559250721 |       1 | 1000 | dynamic_offset |      |
+---------------+---------+------+----------------+------+

But none of the jobs populate a value for id 2 (mem):
+--------+-------------+------------------------------------+
| id_job | tres_req    | tres_alloc                         |
+--------+-------------+------------------------------------+
|  19779 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19780 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19781 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19782 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19783 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19784 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19785 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19786 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19787 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19788 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
+--------+-------------+------------------------------------+
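
To make the tres_req/tres_alloc strings easier to read against the
tres_table, here is a rough Python sketch that decodes them. The names
decode_tres and has_mem are my own, not anything from Slurm, and the id
mapping is simply copied from the table above. I am also assuming that
18446744073709551614 (2**64 - 2) is Slurm's 64-bit "no value" sentinel
(NO_VAL64) rather than a real energy reading.

```python
# Mapping copied by hand from the tres_table listing above (id -> type).
TRES_TYPES = {
    1: "cpu", 2: "mem", 3: "energy", 4: "node",
    5: "billing", 6: "fs/disk", 7: "vmem", 8: "pages",
}

# 2**64 - 2; assumed here to be Slurm's "no value" sentinel (NO_VAL64).
NO_VAL64 = 0xFFFFFFFFFFFFFFFE


def decode_tres(tres):
    """Turn a string such as '1=4,4=1,5=4' into a {type: count} dict."""
    decoded = {}
    for pair in filter(None, tres.split(",")):
        tres_id, count = pair.split("=", 1)
        name = TRES_TYPES.get(int(tres_id), f"unknown({tres_id})")
        value = int(count)
        # Report the sentinel as None instead of a huge bogus number.
        decoded[name] = None if value == NO_VAL64 else value
    return decoded


def has_mem(tres):
    """True if the TRES string carries an entry for id 2 (mem)."""
    return "mem" in decode_tres(tres)


if __name__ == "__main__":
    # tres_alloc of job 19781 from the table above.
    print(decode_tres("1=4,3=18446744073709551614,4=1,5=4"))
    print(has_mem("1=4,3=18446744073709551614,4=1,5=4"))
```

Running has_mem over every row shown above returns False, which is the
missing-mem symptom in a form that is easy to check in bulk.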

Brian Andrus


On Mon, Sep 16, 2019 at 2:58 PM Brian Andrus <toomuchit at gmail.com> wrote:

> I have
> JobAcctGatherType       = jobacct_gather/linux
>
> Brian
>
> On Mon, Sep 16, 2019 at 12:40 PM Antony Cleave <antony.cleave at gmail.com>
> wrote:
>
>> Just a quick thought.
>>
>> What is your slurm.conf setting for this?
>>
>> *JobAcctGatherType* is operating system dependent and controls what
>> mechanism is used to collect accounting information. Supported values are
>> *jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
>> *jobacct_gather/none* (no information collected).
>>
>> Antony
>>
>>
>> On Mon, 16 Sep 2019, 14:07 Brian Andrus, <toomuchit at gmail.com> wrote:
>>
>>> Yep, the maxrss field is always blank.
>>>
>>> I just checked on a different cluster and have the same result. Jobs
>>> that completed last week even have nothing in that field.
>>>
>>> So how can I troubleshoot this? Is there a way to log the sql queries
>>> made by slurmdbd?
>>>
>>> Brian
>>>
>>> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
>>> > On 9/15/19 4:17 PM, Brian Andrus wrote:
>>> >
>>> >> Are steps required to capture Max RSS?
>>> >
>>> > No, you should see a MaxRSS reported for the batch step, for instance:
>>> >
>>> > $ sacct -j $JOBID -o jobid,jobname,maxrss
>>> >
>>> > All the best,
>>> > Chris
>>>
>>>