Message-ID: <20090428065507.GA2024@elte.hu>
Date:	Tue, 28 Apr 2009 08:55:07 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Wu Fengguang <fengguang.wu@...el.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andi Kleen <andi@...stfloor.org>,
	Matt Mackall <mpm@...enic.com>,
	Alexey Dobriyan <adobriyan@...il.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags


* Wu Fengguang <fengguang.wu@...el.com> wrote:

> Export 9 page flags in /proc/kpageflags, and 8 more for kernel developers.
> 
> 1) for kernel hackers (on CONFIG_DEBUG_KERNEL)
>    - all available page flags are exported, and
>    - exported as is
> 2) for admins and end users
>    - only the more `well known' flags are exported:
> 	11. KPF_MMAP		(pseudo flag) memory mapped page
> 	12. KPF_ANON		(pseudo flag) memory mapped page (anonymous)
> 	13. KPF_SWAPCACHE	page is in swap cache
> 	14. KPF_SWAPBACKED	page is swap/RAM backed
> 	15. KPF_COMPOUND_HEAD	(*)
> 	16. KPF_COMPOUND_TAIL	(*)
> 	17. KPF_UNEVICTABLE	page is in the unevictable LRU list
> 	18. KPF_HWPOISON	hardware detected corruption
> 	19. KPF_NOPAGE		(pseudo flag) no page frame at the address
> 
> 	(*) For compound pages, exporting _both_ head/tail info enables
> 	    users to tell where a compound page starts/ends, and its order.
> 
>    - limit flags to their typical usage scenario, as indicated by KOSAKI:
> 	- LRU pages: only export relevant flags
> 		- PG_lru
> 		- PG_unevictable
> 		- PG_active
> 		- PG_referenced
> 		- page_mapped()
> 		- PageAnon()
> 		- PG_swapcache
> 		- PG_swapbacked
> 		- PG_reclaim
> 	- no-IO pages: mask out irrelevant flags
> 		- PG_dirty
> 		- PG_uptodate
> 		- PG_writeback
> 	- SLAB pages: mask out overloaded flags:
> 		- PG_error
> 		- PG_active
> 		- PG_private
> 	- PG_reclaim: mask out the overloaded PG_readahead
> 	- compound flags: only export huge/gigantic pages
> 
> Here are the admin/linus views of all page flags on a newly booted nfs-root system:
> 
> # ./page-types # for admin
>          flags  page-count       MB  symbolic-flags                     long-symbolic-flags
> 0x000000000000      491174     1918  ____________________________                
> 0x000000000020           1        0  _____l______________________       lru      
> 0x000000000028        2543        9  ___U_l______________________       uptodate,lru
> 0x00000000002c        5288       20  __RU_l______________________       referenced,uptodate,lru
> 0x000000004060           1        0  _____lA_______b_____________       lru,active,swapbacked

I think I have to NAK this kind of ad-hoc instrumentation of kernel 
internals and statistics until we clear up why such instrumentation 
measures are being accepted into the MM while other, more dynamic 
and more flexible MM instrumentation is being resisted by Andrew.

The above type of condensed information can be built out of dynamic 
trace data too - and much more. Being able to track page state 
transitions is very valuable when debugging VM problems. One such 
'view' of trace data would be a summary histogram like above.

( done after an "echo 3 > /proc/sys/vm/drop_caches" to make sure all 
  interesting pages have been re-established and their state is 
  present in the trace. )
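
As an illustration of the kind of summary histogram discussed above, 
here is a minimal Python sketch. It assumes a stream of 64-bit per-page 
flag words (the layout /proc/kpageflags uses, one word per page frame) 
and 4 KiB pages. Bits 11-19 follow the numbering in the quoted patch, 
bits 0-10 follow the existing kpageflags layout, and the helper names 
(read_flag_words, decode, histogram) are made up for this example. The 
same aggregation could just as well be fed from trace data.

```python
import struct
from collections import Counter

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Bit position -> short name. Bits 11-19 are taken from the quoted patch.
FLAG_NAMES = [
    "locked", "error", "referenced", "uptodate", "dirty", "lru",
    "active", "slab", "writeback", "reclaim", "buddy",
    "mmap", "anon", "swapcache", "swapbacked",
    "compound_head", "compound_tail", "unevictable", "hwpoison", "nopage",
]


def read_flag_words(path="/proc/kpageflags"):
    """Read one native-endian 64-bit flag word per page frame (needs root)."""
    with open(path, "rb") as f:
        data = f.read()
    return [word for (word,) in struct.iter_unpack("=Q", data)]


def decode(flags):
    """Return the comma-separated names of the bits set in `flags`."""
    return ",".join(
        name for bit, name in enumerate(FLAG_NAMES) if flags & (1 << bit)
    )


def histogram(flag_words):
    """Summarize flag words as (flags, page_count, MB, names), sorted by flags."""
    counts = Counter(flag_words)
    return [
        (flags, n, (n * PAGE_SIZE) >> 20, decode(flags))
        for flags, n in sorted(counts.items())
    ]


def print_histogram(rows):
    """Print rows in a layout loosely resembling the quoted page-types output."""
    print("         flags  page-count       MB  long-symbolic-flags")
    for flags, n, mb, names in rows:
        print(f"0x{flags:012x}  {n:10d} {mb:8d}  {names}")
```

On a live system this would be driven as 
print_histogram(histogram(read_flag_words())); the aggregation itself 
only needs an iterable of integers, so it does not care whether they 
came from /proc or from a trace.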

The SLAB code already has such a facility, kmemtrace: it's very 
useful and successful in visualizing complex SLAB details, both 
dynamically and statically.

I think the same general approach should be used for the page 
allocator too (and for the page cache and some other struct page 
based caches): the life-time of an object should be followed. If we 
capture the important details we capture the big picture too. Pekka 
already sent an RFC patch to extend kmemtrace in such a fashion. Why 
is that more useful method not being pursued?
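
To make the "follow the life-time of an object" idea concrete, here is 
a rough Python sketch. The event tuples (timestamp, kind, pfn), the 
"alloc"/"free" kinds and the function name page_lifetimes are 
hypothetical and do not reflect kmemtrace's actual record format. The 
point is only that once alloc/free events are captured, both per-object 
lifetimes and a snapshot of the still-live objects fall out of a simple 
replay.

```python
def page_lifetimes(events):
    """Replay hypothetical (timestamp, kind, pfn) events in order.

    Returns (lifetimes, live):
      lifetimes - list of (pfn, duration) for objects that were freed
      live      - dict mapping pfn -> allocation timestamp for objects
                  still allocated at the end of the trace
    """
    live = {}
    lifetimes = []
    for ts, kind, pfn in events:
        if kind == "alloc":
            live[pfn] = ts
        elif kind == "free" and pfn in live:
            # Only frees whose allocation was seen in the trace are counted.
            lifetimes.append((pfn, ts - live.pop(pfn)))
    return lifetimes, live
```

The `live` dictionary at the end of the replay is the static snapshot; 
the `lifetimes` list is the dynamic history that a snapshot-only 
interface cannot provide.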

By extending the (existing) /proc/kpageflags hack a use case is 
taken away from the tracing-based solution and a needless overlap is 
created - and that's not particularly helpful IMHO. We now have all 
the facilities upstream that allow us to do intelligent 
instrumentation - we should make use of them.

	Ingo