linux-kernel - Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108

Message-ID: <450A4F88.2050409@us.ibm.com>
Date:	Fri, 15 Sep 2006 00:00:24 -0700
From:	Vara Prasad <prasadav@...ibm.com>
To:	Martin Bligh <mbligh@...igh.org>
CC:	Ingo Molnar <mingo@...e.hu>, Roman Zippel <zippel@...ux-m68k.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	linux-kernel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>,
	Andrew Morton <akpm@...l.org>, Ingo Molnar <mingo@...hat.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tom Zanussi <zanussi@...ibm.com>, ltt-dev@...fik.org,
	Michel Dagenais <michel.dagenais@...ymtl.ca>, fche@...hat.com,
	systemtap <systemtap@...rceware.org>
Subject: Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108

Martin Bligh wrote:

> Ingo Molnar wrote:
>
>> * Martin Bligh <mbligh@...igh.org> wrote:
>>
>>>> i very much agree that they should become as fast as possible. So 
>>>> to rephrase the question: can we make dynamic tracepoints as fast 
>>>> (or nearly as fast) as static tracepoints? If yes, should we care 
>>>> about static tracers at all?
>>>
>>>
>>> Depends how many nops you're willing to add, I guess. Anything, even 
>>> the static tracepoints really needs at least a branch to be useful, 
>>> IMHO. At least for what I've been doing with it, you need to stop 
>>> the data flow after a while (when the event you're interested in 
>>> happens, I'm using it like a flight data recorder, so we can go back 
>>> and do postmortem on what went wrong). I should imagine branch 
>>> prediction makes it very cheap on most modern CPUs, but don't have 
>>> hard data to hand.
>>
>>
>> only 5 bytes of NOP are needed by default, so that a kprobe can 
>> insert a call/callq instruction. The easiest way in practice is to 
>> insert a _single_, unconditional function call that is patched out to 
>> NOPs upon its first occurrence (doing this is not a performance issue 
>> at all). That way the only cost is the NOP and the function parameter 
>> preparation side-effects. (which might or might not be significant - 
>> with register calling conventions and most parameters being readily 
>> available it should be small.)
>>
>> note that such a limited, minimally invasive 'data extraction point' 
>> infrastructure is not actually what the LTT patches are doing. It's 
>> not even close, and i think you'll be surprised. Let me quote from 
>> the latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same 
>> version submitted to lkml - although no specific tracepoints were 
>> submitted):
>
>
> OK, I grant you that's pretty scary ;-) However, it's not the only way
> to do it. Most things we're using write a statically sized 64-bit event
> into a relayfs buffer, with a timestamp, a minor and major event type,
> and a byte of data payload.
>
>> believe it or not, this is inlined into: kernel/sched.c ...
>>
>> 'enuff said. LTT is so far from being even considerable that it's not 
>> even funny.
>
>
> Particularly if we're doing more complex things like that, I'd agree
> that the overhead of doing the out of line jump is non-existent by
> comparison. Even with the relayfs logging alone, perhaps the jump is
> not that heavy ... hmmm.
>
> If we put the NOPs in (at least as an option on some architectures)
> from a macro, you don't really need the full kprobes implemented to
> do tracing, even ... just overwrite the nops with a jump, so presumably
> would be easier to port. However, not sure how local variable data
> is specified in that case ... perhaps the kprobes guys know better.
> Most of the complexity seemed to be with relocating existing code
> because you didn't have nops.


With kprobes one can place probes anywhere, but probes placed in the 
middle of a function are not maintainable because they are tied to a 
specific location in the code.  Having a NOP leaves a maintainable 
address that we can hook into when needed. 

AFAIK writing portable code that uses local variables is not easy 
without DWARF information, hence we don't handle that in kprobes. 
Jprobes is a special case that gives you access to function arguments 
at the function entry point. SystemTap can be used to specify probes 
anywhere in a function, and local variables can also be used in the 
probe handlers. The problem is still maintainability, as probes are 
specified using line numbers.

>
> To me, the main thing is to have hooks for at least some of the
> basic needs maintained in-kernel - from the dtrace paper Val pointed
> me to, that seems to be exactly what they do too, and it integrates
> with the newly added dynamic ones where necessary. 


Once we have these static markers, one can intermix dynamic probes and 
static probes, getting the best of both worlds, as Frank demonstrated 
at OLS.

Here are a couple of proposals that were discussed on the systemtap 
mailing list for how to specify static markers; we could use these 
ideas along with the rest in deciding on a marker proposal.
http://sources.redhat.com/ml/systemtap/2006-q3/msg00273.html
http://sourceware.org/ml/systemtap/2005-q4/msg00415.html

> Plus I hate the
> whole awk thing, and general complexity of systemtap, but we can
> probably avoid that easily enough - either the embedded C option
> you mentioned, or just a different definition for the same hook macros
> under a config option.
>
> So perhaps it'll all work. Still need a little bit of data maintained
> in tree though.


For placing probes at the beginning and end of a function we don't 
really need markers, as the function boundary works as a marker.
I think we only need markers in the few places where an important 
decision is made in the middle of a function.

>
> M.
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

