Message-ID: <20090728224805.GB5104@nowhere>
Date:	Wed, 29 Jul 2009 00:48:06 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Mel Gorman <mel@....ul.ie>, Steven Rostedt <rostedt@...dmis.org>,
	Li Zefan <lizf@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>
Cc:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 0/3] Add some trace events for the page allocator

On Tue, Jul 28, 2009 at 11:23:36PM +0100, Mel Gorman wrote:
> The following three patches add some trace events for the page allocator under
> the heading of kmem (should there be a pagealloc heading instead?). Testing
> under qemu seems to show up reasonable results but this is a prototype for
> comment that hasn't been very heavily tested. I was able to find at least
> one anomaly looking at the output in relation to anti-fragmentation which
> I'm still thinking about so minimally, it was useful for that but I've made
> an attempt to justify each of the events added.
> 
> The patches are as follows
> 
> 	Patch 1 adds events for plain old allocation and freeing of pages
> 	Patch 2 gives information useful for analysing fragmentation avoidance
> 	Patch 3 tracks pages going to and from the buddy lists as an indirect
> 		indication of zone lock hotness
> 
> The first one could be used as an indicator as to whether the workload was
> heavily dependent on the page allocator or not. You can make a guess based
> on vmstat but you can't get a per-process breakdown. I did have trouble with
> the call-site portion of the allocation. Depending on the path, you might
> just get the address of __get_free_pages() instead of a useful callsite. I
> didn't see a nice way to always report a "useful" call_site.
> 
> The second patch would mainly be useful for users of hugepages and
> particularly dynamic hugepage pool resizing as it could be used to tune
> min_free_kbytes to a level at which fragmentation is rarely a problem. My
> main concern is that maybe I'm trying to jam too much into the TP_printk
> that could be extrapolated after the fact if you were familiar with the
> implementation. I couldn't determine if it was best to hold the hand of
> the administrator even if it cost more to figure it out.
> 
> The last patch is trickier to draw conclusions from but high activity on
> those events could explain why there were a large number of cache misses
> on a page-allocator-intensive workload. The coalescing and splitting of
> buddies involves a lot of writing of page metadata and cache line bounces
> not to mention the acquisition of an interrupt-safe lock necessary to enter
> this path. One problem is that one function traced is likely to change its
> name in the future.  When that happens, the trace event will be replaced
> with something similar, but not identical. I've been told this is probably
> ok but there has been whinging in the past about whether debugfs represents
> an ABI or not.
> 
> This is the first time I've looked at adding trace events so apologies
> for any obvious mistakes made as I haven't been keeping a close eye on all
> the tracing discussions describing How Things Should Be Done. checkpatch
> throws major wobblies about this patchset, but it's consistent with the
> style of other events so I ignored it. The "To:" list is taken from
> another tracepoint mail. If there is a specific list I should have used,
> feel free to slap me with a clue stick. All comments indicating whether this is
> generally useful and how it might be improved are welcome.
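
The per-process breakdown mentioned for patch 1 could be approximated by
post-processing the text trace output. The sketch below is illustrative
only and is not part of the patchset: it assumes lines in the general
"comm-pid [cpu] timestamp: event: args" layout of ftrace text output, and
the event name "mm_page_alloc" is a hypothetical placeholder for whatever
name the allocation event ends up with.

```python
import re
from collections import Counter

# Assumed line layout: "<comm>-<pid> [<cpu>] <timestamp>: <event>: <args>".
# Lines that do not fit (e.g. "#" header comments) are skipped.
LINE_RE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>[\d.]+):\s+(?P<event>\w+):"
)


def allocations_per_task(lines, event="mm_page_alloc"):
    """Count occurrences of `event` per (comm, pid) pair.

    `event` defaults to a hypothetical event name; pass the real one.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("event") == event:
            counts[(match.group("comm"), int(match.group("pid")))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, used only to exercise the parser.
    sample = [
        "# tracer: nop",
        "bash-1234 [000] 100.000001: mm_page_alloc: order=0",
        "bash-1234 [000] 100.000002: mm_page_free: order=0",
        "  sshd-77 [001] 100.500000: mm_page_alloc: order=1",
    ]
    for (comm, pid), n in allocations_per_task(sample).most_common():
        print(f"{comm}-{pid}: {n} allocation event(s)")
```

Counting events per task only shows which processes lean on the page
allocator most often; it does not weight allocations by order.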


(Adding some other tracing + slab allocator/kmemtrace people in Cc)

