linux-kernel - Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4509B5A4.2070508@mbligh.org>
Date:	Thu, 14 Sep 2006 13:03:48 -0700
From:	Martin Bligh <mbligh@...igh.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	linux-kernel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>,
	Andrew Morton <akpm@...l.org>, Ingo Molnar <mingo@...hat.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tom Zanussi <zanussi@...ibm.com>, ltt-dev@...fik.org,
	Michel Dagenais <michel.dagenais@...ymtl.ca>
Subject: Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108

Ingo Molnar wrote:
> * Martin J. Bligh <mbligh@...igh.org> wrote:
> 
> 
>>>>Comments and reviews are very welcome.
>>>
>>>i have one very fundamental question: why should we do this 
>>>source-intrusive method of adding tracepoints instead of the 
>>>dynamic, unintrusive (and thus zero-overhead) KProbes+SystemTap 
>>>method?
>>
>>Because:
>>
>>1. Kprobes are more overhead when they *are* being used.
> 
> 
> minimally so - at least on i386 and x86_64. In that sense tracing is a 
> _slowpath_, and it _will_ slow things down if done excessively. I dont 
> care about the tracepoint being slower by a few instructions as long as 
> it has _zero effect_ on normal code, be that source code or binary code.

Would be interesting to see some measurements. But jumping is slower
than a simple branch (or noops to skip over that can be overwritten).

>>2. You can get zero overhead by CONFIG'ing things out.
> 
> but that's not how a fair chunk of people want to use tracing. People 
> (enterprise customers trying to figure out performance problems, 
> engineers trying to debug things on a live, production system) want to 
> be able to insert a tracepoint anywhere and anytime - and also they want 
> to have zero overhead from tracing if no tracepoints are used on a 
> system.

I'm fine with that ... "a fair chunk of people" - but it's not everyone,
by any means. We need both static and dynamic tracepoints, in one
infrastructure.

>>3. (most importantly) it's a bitch to maintain tracepoints out
>>   of-tree on a rapidly moving kernel
> 
> wrong: the original demo tracepoints that came with SystemTap still work 
> on the current kernel, because the 'coupling' is loose: based on 
> function names.

And what do those trace? I bet not half the stuff we want to do.
I've been migrating Google's tracepoints around between different
kernel versions, and it's not a mechanical port. Just stupid things
like renaming of functions inside memory reclaim creates pain, for
starters. (shrink_cache/shrink_list, refill_inactive_zone, etc).

> Static tracepoints on the other hand, if added via an external patch, do 
> depend on the target function not moving around and the context of the 
> tracepoint not being changed. (and static tracepoints if in the source 
> all the time are a constant hindrance to development and code 
> readability.)

an external patch is, indeed, pretty useless. Merging a few simple
tracepoints should not be a problem - see blktrace and schedstats,
for instance.

> and of course the big advantage of dynamic probing is its flexibility: 
> you can add add-hoc tracepoints to thousands of functions, instead of 
> having to maintain hundreds (or thousands) of static tracepoints all the 
> time. (and if we wont end up with hundreds/thousands of static 
> tracepoints then it wont be usable enough as a generic solution.)

I wasn't saying that dynamic tracepoints are useless - I agree it's
valuable to add stuff on the fly. But some things are better done
statically.

>>4. I believe kprobes still doesn't have full access to local 
>>variables. 
> 
> wrong: with SystemTap you can probe local variables too (via 
> jprobes/kretprobes, all in the upstream kernel already).

I'll look again, but last time I looked it didn't do this, and
when I spoke to the kprobes/systemtap people at OLS, IIRC they
said it still couldn't.

>>Now (3) is possibly solvable by putting the points in as no-ops 
>>(either insert a few nops or just a marker entry in the symbol 
>>table?), but full dynamic just isn't sustainable. [...]
> 
> i'm not sure i follow. Could you explain where SystemTap has this 
> difficulty?

If you have an extremely limited set of probes, on a static area
of the kernel, then yes, they may work for a long time. But try
tracing something like the scheduler, which people seem to delight
in rewriting every month or two ...

It amuses me that we're so opposed to external patches to the tree
(for perfectly understandable reasons), but we somehow think tracepoints
are magically different and should be maintained out of tree somehow.
You yourself made the argument that it's a maintainance burden to
keep the trace points *in* the tree ... if that's true, how is it
any easier to keep them outside of the tree?

If we really want to, we can still keep the hooks inside the code,
and have them do absolutely nothing at all - putting markers into
the symbol table is pretty much free. It also reuses the well
structured code-sharing mechanism we already have in place - the
linux kernel tree.

I really don't want to deal with all the systemtap crap - I just
want something that works, and I don't particularly care if I have
to recompile the kernel to get it. I know that doesn't suit everyone,
but there are requirements on both sides, and we should not dismiss
each other's requirements out of hand.

Having one consistent consistent collection mechanism for all these
different types of tracing data seems both logical and very important
to me ...

M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/