[slurm-users] MaxRSS not showing up in sacct
Brian Andrus
toomuchit at gmail.com
Mon Sep 16 22:14:59 UTC 2019
One other thing I noticed is that the *_job_table has entries in
tres_alloc and tres_req that seem to match the ids in the tres_table,
but there are no mem entries.
For example, tres_table:
+---------------+---------+------+----------------+------+
| creation_time | deleted | id   | type           | name |
+---------------+---------+------+----------------+------+
|    1559250721 |       0 |    1 | cpu            |      |
|    1559250721 |       0 |    2 | mem            |      |
|    1559250721 |       0 |    3 | energy         |      |
|    1559250721 |       0 |    4 | node           |      |
|    1559250721 |       0 |    5 | billing        |      |
|    1559250721 |       0 |    6 | fs             | disk |
|    1559250721 |       0 |    7 | vmem           |      |
|    1559250721 |       0 |    8 | pages          |      |
|    1559250721 |       1 | 1000 | dynamic_offset |      |
+---------------+---------+------+----------------+------+
But none of the jobs populate a value for id 2 (mem):
+--------+-------------+------------------------------------+
| id_job | tres_req    | tres_alloc                         |
+--------+-------------+------------------------------------+
|  19779 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19780 | 1=1,4=1,5=1 | 1=4,4=1,5=4                        |
|  19781 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19782 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19783 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19784 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19785 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19786 | 1=1,4=1,5=1 | 1=16,4=1,5=16                      |
|  19787 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
|  19788 | 1=1,4=1,5=1 | 1=4,3=18446744073709551614,4=1,5=4 |
+--------+-------------+------------------------------------+
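To make those strings easier to read, here is a rough Python sketch that
decodes an "id=count" list using the ids from the tres_table above. The
names TRES_TYPES, UNSET_64 and decode_tres are made up for illustration
and are not part of Slurm. Treating 18446744073709551614 (2**64 - 2) as
an "unset" marker rather than a real energy reading is an assumption on
my part.

```python
# TRES ids copied from the tres_table output above (the deleted row is omitted).
TRES_TYPES = {
    1: "cpu",
    2: "mem",
    3: "energy",
    4: "node",
    5: "billing",
    6: "fs/disk",
    7: "vmem",
    8: "pages",
}

# 2**64 - 2 == 18446744073709551614. Assumption: this is a sentinel
# meaning "no value", not an actual measurement.
UNSET_64 = 2**64 - 2


def decode_tres(tres):
    """Turn a string like '1=4,4=1,5=4' into {type_name: count}."""
    decoded = {}
    for pair in filter(None, tres.split(",")):
        tres_id, count = pair.split("=", 1)
        name = TRES_TYPES.get(int(tres_id), f"unknown({tres_id})")
        value = int(count)
        # Report the assumed sentinel as None instead of a huge number.
        decoded[name] = None if value == UNSET_64 else value
    return decoded


alloc = decode_tres("1=4,3=18446744073709551614,4=1,5=4")
print(alloc)
print("mem recorded:", "mem" in alloc)
```

Run against the rows above, none of the decoded dictionaries contains a
"mem" key, which is the gap I am describing.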
Brian Andrus
On Mon, Sep 16, 2019 at 2:58 PM Brian Andrus <toomuchit at gmail.com> wrote:
> I have
> JobAcctGatherType = jobacct_gather/linux
>
> Brian
>
> On Mon, Sep 16, 2019 at 12:40 PM Antony Cleave <antony.cleave at gmail.com>
> wrote:
>
>> Just a quick thought.
>>
>> What is your slurm.conf setting for this?
>>
>> *JobAcctGatherType* is operating system dependent and controls what
>> mechanism is used to collect accounting information. Supported values are
>> *jobacct_gather/linux* (recommended), *jobacct_gather/cgroup* and
>> *jobacct_gather/none* (no information collected).
>>
>> Antony
>>
>>
>> On Mon, 16 Sep 2019, 14:07 Brian Andrus, <toomuchit at gmail.com> wrote:
>>
>>> Yep, the maxrss field is always blank.
>>>
>>> I just checked on a different cluster and have the same result. Jobs
>>> that completed last week even have nothing in that field.
>>>
>>> So how can I troubleshoot this? Is there a way to log the sql queries
>>> made by slurmdbd?
>>>
>>> Brian
>>>
>>> On 9/15/2019 10:29 PM, Christopher Samuel wrote:
>>> > On 9/15/19 4:17 PM, Brian Andrus wrote:
>>> >
>>> >> Are steps required to capture Max RSS?
>>> >
>>> > No, you should see a MaxRSS reported for the batch step, for instance:
>>> >
>>> > $ sacct -j $JOBID -o jobid,jobname,maxrss
>>> >
>>> > All the best,
>>> > Chris
>>>
>>>