Message-ID: <20090903192257.GA25363@elte.hu>
Date: Thu, 3 Sep 2009 21:22:57 +0200
From: Ingo Molnar <mingo@...e.hu>
To: "K.Prasad" <prasad@...ux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Lai Jiangshan <laijs@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Alan Stern <stern@...land.harvard.edu>,
Paul Mackerras <paulus@...ba.org>,
David Gibson <dwg@....ibm.com>
Subject: Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests

* K.Prasad <prasad@...ux.vnet.ibm.com> wrote:

> On Wed, Sep 02, 2009 at 01:51:33AM +0200, Frederic Weisbecker wrote:
> > On Tue, Sep 01, 2009 at 12:08:45PM +0530, K.Prasad wrote:
> > > On Sat, Aug 29, 2009 at 03:41:07PM +0200, Ingo Molnar wrote:
> > > >
> > > > * K.Prasad <prasad@...ux.vnet.ibm.com> wrote:
> > > >
> > > > > I am not sure if pmus can handle, (or want to handle) all the
> > > > > intricacies involved with the hw-breakpoint layer [...]
> > > >
> > > > Which are those intricacies? It's all rather straightforward
> > > > register scheduling and reservation stuff - which perfcounters
> > > > already solves in a very rich way.
> > > >
> > > > Ingo
> > >
> [edited]
> > > And post integration, in-kernel users like ptrace, kgdb* and xmon*
> > > which hitherto have interacted directly with the debug registers
> > > (through set_debugreg()/set_dabr()) should route their requests through the
> > > perf-layer. It is difficult to imagine ptrace's idempotent requests
> > > (through ptrace_<get><set>_debugreg()) having to pass through perf-layer
> > > (and becoming dependent on CONFIG_PERF_COUNTERS), not to mention the
> > > tricks required to synchronise signal generation timing with exception
> > > behaviour (especially on PPC64).
> > > * - Not converted to use hw-breakpoint layer yet
> >
> >
> > Actually, I see the perf layer here as a middle man between
> >
> > - the very hardware stuff (dr[0-467]) handling, reading, writing, updating
> > - the core API (register_kernel_breakpoint(), register_user_breakpoint() etc..)
> >
> > And this middle man can handle so many things on its own that the two above
> > shrink drastically.
> >
> > Also the ptrace thing is tricky in itself, and that can't be helped easily.
> > Because of the direct writing to debug registers done by POKE_USR,
> > whatever the current breakpoint API with or without perf integration, we still
> > need subterfuges to carry it.
> >
>
> The reverse dependency this would create on perf (CONFIG_PERF) for the
> hw-breakpoint layer is an undesirable side-effect, and gives rise to
> at least two immediate questions:
>
> - Handling of requests for hw-breakpoint from users like ptrace when
> CONFIG_PERF is not turned on

This is basically just a build/layering logistics question and it is
solved easily - we could have a library mode for it.

> - Managing 'register scheduling and reservation' on architectures where
> perf layer isn't ported. An inefficient way of handling this would be
> to retain the existing register allocation code of hw-breakpoint for
> such architectures - thereby artificially imposing arch-specific code
> into generic stuff.

Minimally porting perf to enable a hw-breakpoints PMU extension is
very easy in practice. For example on s390 it took just 15 lines of
code:

  12310e9: [S390] Enable tick based perf_counter on s390.

   arch/s390/Kconfig                    |    1 +
   arch/s390/include/asm/perf_counter.h |    8 ++++++++
   tools/perf/perf.h                    |    6 ++++++
   3 files changed, 15 insertions(+), 0 deletions(-)

On FRV it took 38 lines (60% of which are boilerplate copyright
notices), on PARISC 15 lines.

By far the most complexity is in factoring out the hw-breakpoint
code itself - and that has to be done regardless of the register
scheduling model.

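
To make "register scheduling and reservation" concrete, here is a
small user-space C sketch. It is not the perf_counter code: struct
cpu_regs, struct task_bps and the cpu_* helpers are hypothetical
names chosen for illustration, and the four-slot limit assumes the
x86 address registers DR0-DR3. CPU-wide breakpoints are pinned into
the low slots; a task's breakpoints are installed into the remaining
slots when the task is switched in and cleared when it is switched
out.

```c
#include <assert.h>
#include <stddef.h>

#define HW_SLOTS 4  /* assumption: x86 address registers DR0-DR3 */

/* Hypothetical per-task request list (not a kernel structure). */
struct task_bps {
    unsigned long addr[HW_SLOTS];
    int nr;                     /* number of valid entries in addr[] */
};

/* Hypothetical per-CPU view of the debug address registers. */
struct cpu_regs {
    unsigned long addr[HW_SLOTS];
    int nr_pinned;              /* slots [0, nr_pinned) are CPU-wide */
};

/*
 * Reserve one slot for a CPU-wide breakpoint.
 * In this sketch it must be called while no task is switched in,
 * because it claims the slot right after the pinned region.
 * Returns 0 on success, -1 if every slot is already pinned.
 */
static int cpu_pin(struct cpu_regs *cpu, unsigned long addr)
{
    assert(cpu != NULL);
    if (cpu->nr_pinned >= HW_SLOTS)
        return -1;
    cpu->addr[cpu->nr_pinned++] = addr;
    return 0;
}

/*
 * Install a task's breakpoints into the slots left over after the
 * pinned ones. Returns the number installed, or -1 if they do not
 * all fit (nothing is installed in that case).
 */
static int cpu_sched_in(struct cpu_regs *cpu, const struct task_bps *t)
{
    assert(cpu != NULL && t != NULL);
    if (t->nr > HW_SLOTS - cpu->nr_pinned)
        return -1;
    for (int i = 0; i < t->nr; i++)
        cpu->addr[cpu->nr_pinned + i] = t->addr[i];
    return t->nr;
}

/* Clear every non-pinned slot when the task is switched out. */
static void cpu_sched_out(struct cpu_regs *cpu)
{
    assert(cpu != NULL);
    for (int i = cpu->nr_pinned; i < HW_SLOTS; i++)
        cpu->addr[i] = 0;
}
```

With one pinned slot, a task asking for three breakpoints still
fits; once a second slot is pinned, the same task is refused at
switch-in. A real implementation has to decide what to do in that
case (fail the reservation up front, or multiplex), which is the
policy question the sketch leaves open.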
> A solution here would be to detach parts of perf layer's code that
> handle register scheduling and reservation (which I learn are in
> kernel/perf_counter.c) into a separate entity (outside the ambit
> of CONFIG_PERF) that can serve the needs of both hw-breakpoint and
> perf thereby eliminating the two issues enumerated above.
>
> The tight coupling between the functions that perform register
> scheduling (in kernel/perf_counter.c) and perf's data structures
> is quite apparent and does suggest non-trivial amount of effort to
> detach them into a layer of its own.
>
> However this might be quite necessary in order to balance between
> a desire to re-use the 'register scheduling and reservation' code
> of perf-layer while not running into issues as above.
>
> This, along with the framework (described in the previous mail) to
> retain the hw-breakpoint's APIs + code interacting with debug
> registers (including exception handling) would be a good
> compromise.

I don't think the librarization is all that complex. It's very much
desired, as we'd reuse an existing piece of infrastructure to
implement another one - this is always good.

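
As a rough illustration of the layering, the following user-space C
sketch shows one way a "library mode" could be split. Everything in
it is an assumption made for the example: the demo_* names are
hypothetical, and DEMO_PERF_FRONTEND merely stands in for a Kconfig
option such as CONFIG_PERF_COUNTERS. The slot-reservation core is
always built; a ptrace-like user depends only on that core; the
counter front end is compiled in only when the option is set.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical build switch standing in for a Kconfig option.
 * Define it to build the optional counter front end as well.
 */
/* #define DEMO_PERF_FRONTEND 1 */

#define DEMO_NR_SLOTS 4  /* assumption: four address registers */

/* ---- core library: always built, knows nothing about counters ---- */

struct demo_slots {
    const void *owner[DEMO_NR_SLOTS];  /* NULL means the slot is free */
};

/* Reserve the first free slot for 'owner'; returns its index or -1. */
static int demo_slot_reserve(struct demo_slots *s, const void *owner)
{
    assert(s != NULL && owner != NULL);
    for (int i = 0; i < DEMO_NR_SLOTS; i++) {
        if (s->owner[i] == NULL) {
            s->owner[i] = owner;
            return i;
        }
    }
    return -1;
}

/* Release a previously reserved slot; out-of-range indexes are ignored. */
static void demo_slot_release(struct demo_slots *s, int slot)
{
    assert(s != NULL);
    if (slot >= 0 && slot < DEMO_NR_SLOTS)
        s->owner[slot] = NULL;
}

/* ---- ptrace-like user: depends on the core library only ---- */

static int demo_ptrace_set_bp(struct demo_slots *s, const void *task)
{
    return demo_slot_reserve(s, task);
}

#ifdef DEMO_PERF_FRONTEND
/* ---- optional front end: built only when the switch is defined ---- */

static int demo_counter_open(struct demo_slots *s, const void *counter)
{
    return demo_slot_reserve(s, counter);
}
#endif
```

The point of the split is that demo_ptrace_set_bp() compiles and
works whether or not DEMO_PERF_FRONTEND is defined, while both users
share one reservation path when it is.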
Ingo