Message-ID: <6bb58fe4-d860-555e-3fb9-17b4ab552da6@yandex-team.ru>
Date:   Fri, 17 May 2019 14:42:24 +0300
From:   Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:     Roman Gushchin <guro@...com>
Cc:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] proc/meminfo: add KernelMisc counter

On 16.05.2019 20:59, Roman Gushchin wrote:
> On Wed, May 15, 2019 at 02:49:48PM +0300, Konstantin Khlebnikov wrote:
>> Some kernel memory allocations are not accounted anywhere.
>> This adds an easy-to-read counter for them by subtracting all tracked kinds.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
> 
> We have something similar in userspace, and it was very useful several times.
> In our case, it was mostly vmallocs and percpu stuff (which are now shown
> in meminfo), but for sure there are other memory users who are not.
> 
> I don't particularly like the proposed name, but have no better ideas.
> It's really a gray area: all we know is that the memory is occupied
> by something.
> 

It's probably better to add an overall 'MemKernel' counter.
Detailed analysis requires special tools anyway.
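
For illustration, the same subtraction can be approximated from userspace
using fields already exported in /proc/meminfo. This is only a rough sketch,
not the tool mentioned above: the helper names are hypothetical, it assumes a
kernel that exports KReclaimable and Percpu, and it treats
Cached + Buffers + SwapCached as a stand-in for NR_FILE_PAGES.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: value in kB of field `name` in meminfo-style text,
 * or 0 if the field is missing. Only matches at the start of a line. */
static long meminfo_field_kb(const char *text, const char *name)
{
	size_t len;
	const char *p = text;

	assert(text != NULL && name != NULL);
	len = strlen(name);
	while (p && *p) {
		if (strncmp(p, name, len) == 0 && p[len] == ':') {
			long val = 0;

			if (sscanf(p + len + 1, "%ld", &val) == 1)
				return val;
			return 0;
		}
		p = strchr(p, '\n');	/* advance to the next line */
		if (p)
			p++;
	}
	return 0;
}

/* Hypothetical helper: MemTotal minus every tracked kind, clamped at zero
 * the same way the patch clamps kernel_misc. */
static long kernel_misc_kb(const char *text)
{
	static const char *const tracked[] = {
		"MemFree", "AnonPages", "Cached", "Buffers", "SwapCached",
		"KReclaimable", "SUnreclaim", "KernelStack", "PageTables",
		"Percpu",
	};
	long misc = meminfo_field_kb(text, "MemTotal");
	size_t i;

	for (i = 0; i < sizeof(tracked) / sizeof(tracked[0]); i++)
		misc -= meminfo_field_kb(text, tracked[i]);
	return misc < 0 ? 0 : misc;
}
```

To use it, read /proc/meminfo into a NUL-terminated buffer and pass that
buffer to kernel_misc_kb(). The result is in kB and is only as consistent as
the snapshot it was computed from.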

>> ---
>>   Documentation/filesystems/proc.txt |    2 ++
>>   fs/proc/meminfo.c                  |   41 +++++++++++++++++++++++++-----------
>>   2 files changed, 30 insertions(+), 13 deletions(-)
>>
>> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
>> index 66cad5c86171..f11ce167124c 100644
>> --- a/Documentation/filesystems/proc.txt
>> +++ b/Documentation/filesystems/proc.txt
>> @@ -891,6 +891,7 @@ VmallocTotal:   112216 kB
>>   VmallocUsed:       428 kB
>>   VmallocChunk:   111088 kB
>>   Percpu:          62080 kB
>> +KernelMisc:     212856 kB
>>   HardwareCorrupted:   0 kB
>>   AnonHugePages:   49152 kB
>>   ShmemHugePages:      0 kB
>> @@ -988,6 +989,7 @@ VmallocTotal: total size of vmalloc memory area
>>   VmallocChunk: largest contiguous block of vmalloc area which is free
>>         Percpu: Memory allocated to the percpu allocator used to back percpu
>>                 allocations. This stat excludes the cost of metadata.
>> +  KernelMisc: All other kinds of kernel memory allocaitons
>                                                         ^^^
> 						       typo
>>   
>>   ..............................................................................
>>   
>> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
>> index 568d90e17c17..7bc14716fc5d 100644
>> --- a/fs/proc/meminfo.c
>> +++ b/fs/proc/meminfo.c
>> @@ -38,15 +38,21 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>>   	long cached;
>>   	long available;
>>   	unsigned long pages[NR_LRU_LISTS];
>> -	unsigned long sreclaimable, sunreclaim;
>> +	unsigned long sreclaimable, sunreclaim, misc_reclaimable;
>> +	unsigned long kernel_stack_kb, page_tables, percpu_pages;
>> +	unsigned long anon_pages, file_pages, swap_cached;
>> +	long kernel_misc;
>>   	int lru;
>>   
>>   	si_meminfo(&i);
>>   	si_swapinfo(&i);
>>   	committed = percpu_counter_read_positive(&vm_committed_as);
>>   
>> -	cached = global_node_page_state(NR_FILE_PAGES) -
>> -			total_swapcache_pages() - i.bufferram;
>> +	anon_pages = global_node_page_state(NR_ANON_MAPPED);
>> +	file_pages = global_node_page_state(NR_FILE_PAGES);
>> +	swap_cached = total_swapcache_pages();
>> +
>> +	cached = file_pages - swap_cached - i.bufferram;
>>   	if (cached < 0)
>>   		cached = 0;
>>   
>> @@ -56,13 +62,25 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>>   	available = si_mem_available();
>>   	sreclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
>>   	sunreclaim = global_node_page_state(NR_SLAB_UNRECLAIMABLE);
>> +	misc_reclaimable = global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
>> +	kernel_stack_kb = global_zone_page_state(NR_KERNEL_STACK_KB);
>> +	page_tables = global_zone_page_state(NR_PAGETABLE);
>> +	percpu_pages = pcpu_nr_pages();
>> +
>> +	/* all other kinds of kernel memory allocations */
>> +	kernel_misc = i.totalram - i.freeram - anon_pages - file_pages
>> +		      - sreclaimable - sunreclaim - misc_reclaimable
>> +		      - (kernel_stack_kb >> (PAGE_SHIFT - 10))
>> +		      - page_tables - percpu_pages;
>> +	if (kernel_misc < 0)
>> +		kernel_misc = 0;
> 
> Hm, why? Is there any realistic scenario (not caused by the kernel doing
> the memory accounting wrong) when it's negative?
> 
> Maybe it's better to show it as it is, if it's negative? Because
> it might be a good indication that something's wrong with some of
> the counters.

This kind of sanitisation is common practice for racy counters.
See 'cached' above.
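
To spell out the pattern: the inputs are sampled one after another without a
common lock, so their sum can briefly exceed the total and the derived value
dips below zero. A minimal standalone sketch follows; the struct, the function
name and EXAMPLE_PAGE_SHIFT are made up for illustration, and 4 KiB pages are
assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 4 KiB pages; the kernel uses its own PAGE_SHIFT. */
#define EXAMPLE_PAGE_SHIFT 12

/* Hypothetical snapshot; each field is sampled at a slightly different time. */
struct mem_snapshot {
	unsigned long total_pages;
	unsigned long free_pages;
	unsigned long tracked_pages;	/* sum of all accounted page counters */
	unsigned long kernel_stack_kb;	/* kept in kB, not in pages */
};

/* Remainder after subtracting everything tracked, clamped at zero. */
static long derive_misc_pages(const struct mem_snapshot *s)
{
	long misc;

	assert(s != NULL);
	/* Compute in a signed type so a transient deficit is visible... */
	misc = (long)s->total_pages - (long)s->free_pages
	       - (long)s->tracked_pages
	       - (long)(s->kernel_stack_kb >> (EXAMPLE_PAGE_SHIFT - 10));
	/* ...and then hide it, as is done for 'cached'. */
	if (misc < 0)
		misc = 0;
	return misc;
}
```

The shift by (EXAMPLE_PAGE_SHIFT - 10) converts kB to pages, matching the
kernel_stack_kb term in the patch. Dropping the final clamp would give the
behaviour suggested in the review, where a negative value is shown as is.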

> 
> Thanks!
> 
