[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20071031163623.GA4766@Krystal>
Date: Wed, 31 Oct 2007 12:36:24 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: "Frank Ch. Eigler" <fche@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Jeff Garzik <jeff@...zik.org>,
Randy Dunlap <rdunlap@...otime.net>, hch@...radead.org,
linux-kernel@...r.kernel.org, Sam Ravnborg <sam@...nborg.org>,
Jens Axboe <jens.axboe@...cle.com>,
Prasanna S Panchamukhi <prasanna@...ibm.com>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <pzijlstr@...hat.com>,
Philippe Elie <phil.el@...adoo.fr>,
"William L. Irwin" <wli@...omorphy.com>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Lameter <christoph@...eter.com>,
Valdis.Kletnieks@...edu
Subject: Re: [RFC] Create kinst/ or ki/ directory ?
* Frank Ch. Eigler (fche@...hat.com) wrote:
> Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> writes:
>
> > [...] SystemTAP, for instance, does it out of tree by keeping a
> > separate list of address where kprobes must be installed. It does
> > the job on a distribution kernel maintainer perspective (Redhat),
> > since they freeze to a particular kernel version and update this
> > list every time it breaks, but will always be a source of
> > frustration for vanilla kernel users and kernel developers.
>
> This misstates the details. What systemtap has out-of-tree is a list
> of kernel function names (and parameter names), not addresses. This
> list does change somewhat with kernel versions, but we generally keep
> up. We do test with vanilla kernels, and several non-RH distributors
> test with their kernels. It is a problem, but it is manageable.
>
That's right, Systemtap uses symbols, thanks for the clarification. But
my point is still valid: SystemTAP expects function names and argument
names to stay unchanged, therefore using the kernel code itself as an
API to userspace tools. The markers act as a buffer between what
important events userspace tools expect and the actual kernel code.
>
> > I think the best way to follow the code flow is to add markup in the
> > code itself: it would follow the kernel HEAD and let each subsystem
> > maintainer identify the key instrumentation sites of their
> > subsystem.
>
> Of course - when and where the dormant overheads are acceptable, and
I have not been able to detect a significant dormant marker overhead
with the immediate values optimization. A load immediate and a predicted
conditional jump are surprisingly cheap.
> where the maintainers are willing to commit to a long-term interface
> (marker name/arguments).
Yes. I expect that kind of mark-up to be kept minimalistic.
> Systemtap can connect to markers as well as
> to kprobes and other event sources: mix & match based on what's
> available in your particular kernel and what data/computation you
> want.
>
I think that SystemTAP's flexibility is great, but leads to fagileness
wrt kernel code changes. If the "core events" required by SystemTAP
(and also by LTTng by the way) could be turned into markers, I think it
would gain in robustness.
Providing the ability to instrument code locations with breakpoints, in
addition to this, will help users unsatisfied with the information
they have, unwilling to recompile their kernel or modules with their
own markers, ready to accept the two limitations :
- performance hit of a breakpoint
- unability to access variables within optimized functions
So yes, both approaches seems to be complementary.
>
> > About extending on ptrace, I am sorry to say that this solution has
> > the same downsides as kprobes: it is too slow for high performance
> > applications, especially if turned on system-wide. [...]
>
> Roland McGrath's ptrace-replacement (utrace) should help with this.
>
Yes, I think he did a good job at it. However, it is not a replacement
for the markers, SystemTAP or LTTng, because it defines a limited
set of hardcoded events (implying yet another type of code markup) that
is by itself a pain to extend. I am not willing to ask a subsystem
maintainer to do more than to "just identify" their important code
paths, the equivalent of adding a printk to their code. I don't think it
is realistic to ask them to create specialized callbacks for each of the
sites they would like to instrument.
So I would say : I'll try to submit a core set of markers patches for
review on LKML and see what people have to say.
Mathieu
>
> - FChE
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists