linux-kernel - Re: [Patch] mm tracepoints update


Message-ID: <20090423084233.GF599@elte.hu>
Date:	Thu, 23 Apr 2009 10:42:33 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	eduard.munteanu@...ux360.ro, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, riel@...hat.com, rostedt@...dmis.org
Subject: Re: [Patch] mm tracepoints update - use case.

* Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Thu, 23 Apr 2009 09:48:04 +0900 (JST) KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com> wrote:
> 
> > > On Wed, 2009-04-22 at 08:07 -0400, Larry Woodman wrote:
> > > > On Wed, 2009-04-22 at 11:57 +0200, Ingo Molnar wrote:
> > > > > * KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com> wrote:
> > > 
> > > > > > In past thread, Andrew pointed out bare page tracer isn't useful. 
> > > > > 
> > > > > (do you have a link to that mail?)
> 
> http://lkml.indiana.edu/hypermail/linux/kernel/0903.0/02674.html
> 
> And Larry's example use case here tends to reinforce what I said then.  Look:
> 
> : In addition I could see that the priority was decremented to zero and
> : that 12342 pages had been reclaimed rather than just enough to satisfy
> : the page allocation request.
> : 
> : -----------------------------------------------------------------------------
> : # tracer: nop
> : #
> : #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> : #              | |       |          |         |
> : <mem>-10723 [005]  6976.285610: mm_directreclaim_reclaimzone: reclaimed=12342, priority=0
> 
> and
> 
> : -----------------------------------------------------------------------------
> : # tracer: nop
> : #
> : #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> : #              | |       |          |         |
> :            <mem>-10723 [005]   282.776271: mm_pagereclaim_shrinkzone: reclaimed=12342
> :            <mem>-10723 [005]   282.781209: mm_pagereclaim_shrinkzone: reclaimed=3540
> :            <mem>-10723 [005]   282.801194: mm_pagereclaim_shrinkzone: reclaimed=7528
> : -----------------------------------------------------------------------------
> 
> This diagnosis was successful because the "reclaimed" number was 
> weird. By sheer happy coincidence, page-reclaim is already 
> generating the aggregated numbers for us, and the tracer just 
> prints it out.
> 
> If some other problem is being worked on and if there _isn't_ some 
> convenient already-present aggregated result for the tracer to 
> print, the problem won't be solved.  Unless a vast number of trace 
> events are emitted and problem-specific userspace code is written 
> to aggregate them into something which the developer can use.

Not so in the use cases where i made use of tracers. The key is not 
to trace everything, but to have a few key _concepts_ traced 
pervasively. Having a dynamic view of per-event changes is also 
obviously good. In a fast-changing workload you cannot tell from 
summary statistics alone whether rapid changes are the product of 
the inherent entropy of the workload, or the result of the MM being 
confused.
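
To illustrate the per-event view versus a single summary figure, here 
is a minimal sketch that parses lines shaped like the trace output 
quoted above. The line layout is assumed from that quoted output 
only, and parse_line() and per_event_view() are hypothetical helper 
names, not part of any kernel tool.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the quoted trace output:
#   <mem>-10723 [005]   282.776271: mm_pagereclaim_shrinkzone: reclaimed=12342
LINE_RE = re.compile(
    r"^\s*(?P<task>\S+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s+(?P<fields>.*)$"
)


def parse_line(line):
    """Parse one trace line; return None for headers and other text."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = {}
    for item in m.group("fields").split(","):
        key, sep, value = item.strip().partition("=")
        # Keep only integer-valued key=value pairs.
        if sep and value.strip().isdigit():
            fields[key] = int(value)
    return {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "ts": float(m.group("ts")),
        "event": m.group("event"),
        "fields": fields,
    }


def per_event_view(lines, key="reclaimed"):
    """Return (totals, series), both keyed by (pid, event name).

    totals is the summary-statistic view; series keeps every
    (timestamp, value) pair so the dynamics stay visible.
    """
    totals = defaultdict(int)
    series = defaultdict(list)
    for line in lines:
        ev = parse_line(line)
        if ev is None or key not in ev["fields"]:
            continue
        k = (ev["pid"], ev["event"])
        totals[k] += ev["fields"][key]
        series[k].append((ev["ts"], ev["fields"][key]))
    return dict(totals), dict(series)


SAMPLE = """\
# tracer: nop
           <mem>-10723 [005]   282.776271: mm_pagereclaim_shrinkzone: reclaimed=12342
           <mem>-10723 [005]   282.781209: mm_pagereclaim_shrinkzone: reclaimed=3540
           <mem>-10723 [005]   282.801194: mm_pagereclaim_shrinkzone: reclaimed=7528
"""

if __name__ == "__main__":
    totals, series = per_event_view(SAMPLE.splitlines())
    for k, total in totals.items():
        print(k, "total:", total)
        for ts, n in series[k]:
            print(f"  {ts:.6f}  reclaimed={n}")
```

The total alone hides that one call reclaimed 12342 pages; the 
per-event series keeps that visible.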

/proc/ statistics versus good tracing is like the difference 
between a magnifying glass and an electron microscope. Both have 
their strengths, and they are best if used together.

One such conceptual thing in the scheduler is the lifetime of a 
task, its schedule, deschedule and wakeup events. It can already 
show a massive amount of badness in practice, and it only takes a 
few tracepoints to do.

Same goes for the MM IMHO. Number of pages reclaimed is obviously a 
key metric to follow. Larry is an expert who fixed a _lot_ of MM 
crap in the last 5-10 years at Red Hat, so if he says that these 
tracepoints are useful to him, we shouldn't just dismiss that 
experience like that. I wish Larry spent some of his energies on 
fixing the upstream MM too ;-)

A balanced number of MM tracepoints, showing the concepts and the 
inner dynamics of the MM, would be useful. We don't need every little 
detail traced (we have the function tracer for that), but a few key 
aspects would be nice to capture ...

pagefaults, allocations, cache-misses, cache flushes and how pages 
shift between various queues in the MM would be a good start IMHO.

Anyway, i suspect your answer means a NAK :-( Would be nice if you 
would suggest a path out of that NAK.

	Ingo