Message-ID: <6bb58fe4-d860-555e-3fb9-17b4ab552da6@yandex-team.ru>
Date: Fri, 17 May 2019 14:42:24 +0300
From: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To: Roman Gushchin <guro@...com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] proc/meminfo: add KernelMisc counter
On 16.05.2019 20:59, Roman Gushchin wrote:
> On Wed, May 15, 2019 at 02:49:48PM +0300, Konstantin Khlebnikov wrote:
>> Some kernel memory allocations are not accounted anywhere.
>> This adds an easy-to-read counter for them by subtracting all tracked kinds.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
>
> We have something similar in userspace, and it was very useful several times.
> In our case, it was mostly vmallocs and percpu stuff (which are now shown
> in meminfo), but for sure there are other memory users who are not.
>
> I don't particularly like the proposed name, but have no better ideas.
> It's really a gray area: all we know is that the memory is occupied
> by something.
>
It's probably better to add an overall 'MemKernel' counter.
Detailed analysis requires special tools anyway.
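
For reference, the same subtraction can be approximated from userspace using the fields already exported in /proc/meminfo. This is only a sketch, not the patch itself. The helper names meminfo_field_kb() and kernel_misc_kb() are hypothetical, and it assumes that KReclaimable covers both SReclaimable and the misc reclaimable pages and that Cached + SwapCached + Buffers adds up to the file page counter used in the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the value (in kB) of field `key` from
 * meminfo-formatted `text`, or 0 if the field is absent.
 */
static long meminfo_field_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Match "Key:" only at the start of a line. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long val = 0;

			if (sscanf(line + klen + 1, "%ld", &val) == 1)
				return val;
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return 0;
}

/*
 * Hypothetical helper: MemTotal minus every tracked kind, clamped at zero.
 * Assumption: KReclaimable = SReclaimable + misc reclaimable, and
 * Cached + SwapCached + Buffers equals the kernel's file page count.
 */
static long kernel_misc_kb(const char *text)
{
	static const char *const tracked[] = {
		"MemFree", "AnonPages", "Cached", "SwapCached", "Buffers",
		"KReclaimable", "SUnreclaim", "KernelStack", "PageTables",
		"Percpu",
	};
	long rest = meminfo_field_kb(text, "MemTotal");
	size_t i;

	for (i = 0; i < sizeof(tracked) / sizeof(tracked[0]); i++)
		rest -= meminfo_field_kb(text, tracked[i]);

	return rest < 0 ? 0 : rest;
}
```

To use it, read /proc/meminfo into a NUL-terminated buffer and pass that buffer to kernel_misc_kb(). On kernels without a KReclaimable line the lookup returns 0, so SReclaimable would have to be subtracted instead.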
>> ---
>> Documentation/filesystems/proc.txt | 2 ++
>> fs/proc/meminfo.c | 41 +++++++++++++++++++++++++-----------
>> 2 files changed, 30 insertions(+), 13 deletions(-)
>>
>> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
>> index 66cad5c86171..f11ce167124c 100644
>> --- a/Documentation/filesystems/proc.txt
>> +++ b/Documentation/filesystems/proc.txt
>> @@ -891,6 +891,7 @@ VmallocTotal: 112216 kB
>> VmallocUsed: 428 kB
>> VmallocChunk: 111088 kB
>> Percpu: 62080 kB
>> +KernelMisc: 212856 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 49152 kB
>> ShmemHugePages: 0 kB
>> @@ -988,6 +989,7 @@ VmallocTotal: total size of vmalloc memory area
>> VmallocChunk: largest contiguous block of vmalloc area which is free
>> Percpu: Memory allocated to the percpu allocator used to back percpu
>> allocations. This stat excludes the cost of metadata.
>> + KernelMisc: All other kinds of kernel memory allocaitons
> ^^^
> typo
>>
>> ..............................................................................
>>
>> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
>> index 568d90e17c17..7bc14716fc5d 100644
>> --- a/fs/proc/meminfo.c
>> +++ b/fs/proc/meminfo.c
>> @@ -38,15 +38,21 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>> long cached;
>> long available;
>> unsigned long pages[NR_LRU_LISTS];
>> - unsigned long sreclaimable, sunreclaim;
>> + unsigned long sreclaimable, sunreclaim, misc_reclaimable;
>> + unsigned long kernel_stack_kb, page_tables, percpu_pages;
>> + unsigned long anon_pages, file_pages, swap_cached;
>> + long kernel_misc;
>> int lru;
>>
>> si_meminfo(&i);
>> si_swapinfo(&i);
>> committed = percpu_counter_read_positive(&vm_committed_as);
>>
>> - cached = global_node_page_state(NR_FILE_PAGES) -
>> - total_swapcache_pages() - i.bufferram;
>> + anon_pages = global_node_page_state(NR_ANON_MAPPED);
>> + file_pages = global_node_page_state(NR_FILE_PAGES);
>> + swap_cached = total_swapcache_pages();
>> +
>> + cached = file_pages - swap_cached - i.bufferram;
>> if (cached < 0)
>> cached = 0;
>>
>> @@ -56,13 +62,25 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>> available = si_mem_available();
>> sreclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
>> sunreclaim = global_node_page_state(NR_SLAB_UNRECLAIMABLE);
>> + misc_reclaimable = global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
>> + kernel_stack_kb = global_zone_page_state(NR_KERNEL_STACK_KB);
>> + page_tables = global_zone_page_state(NR_PAGETABLE);
>> + percpu_pages = pcpu_nr_pages();
>> +
>> + /* all other kinds of kernel memory allocations */
>> + kernel_misc = i.totalram - i.freeram - anon_pages - file_pages
>> + - sreclaimable - sunreclaim - misc_reclaimable
>> + - (kernel_stack_kb >> (PAGE_SHIFT - 10))
>> + - page_tables - percpu_pages;
>> + if (kernel_misc < 0)
>> + kernel_misc = 0;
>
> Hm, why? Is there any realistic scenario (not caused by the kernel doing
> the memory accounting wrong) when it's negative?
>
> Maybe it's better to show it as it is, if it's negative? Because
> it might be a good indication that something's wrong with some of
> the counters.
This kind of sanitisation is common practice for racy counters.
See 'cached' above.
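
To illustrate how a remainder computed from counters sampled at different moments can go negative without any accounting bug, here is a minimal sketch. The function names and the numbers are made up for illustration; the kernel code reads its counters through the helpers shown in the patch.

```c
/*
 * Hypothetical illustration: the counters are read one after another
 * without a common lock, so a page can move from "free" to "anon"
 * between the two reads and be counted twice.
 */
static long remainder_raw(unsigned long total, unsigned long free_pages,
			  unsigned long anon_pages)
{
	return (long)total - (long)free_pages - (long)anon_pages;
}

/* Same subtraction, sanitised the way 'cached' is: never below zero. */
static long remainder_clamped(unsigned long total, unsigned long free_pages,
			      unsigned long anon_pages)
{
	long rest = remainder_raw(total, free_pages, anon_pages);

	return rest < 0 ? 0 : rest;
}
```

With total = 100, suppose free is read as 40 and then 10 pages are allocated as anonymous memory before anon is read as 70. The raw remainder is -10 even though every individual counter is correct, and the clamped version reports 0.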
>
> Thanks!
>