Message-ID: <20081118163037.GD8088@elte.hu>
Date: Tue, 18 Nov 2008 17:30:37 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc: linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Lai Jiangshan <laijs@...fujitsu.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 06/16] Markers auto enable tracepoints (new API :
trace_mark_tp())
* Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> Markers identify the name (and therefore numeric ID) to attach to an
> "event" and the data types to export into trace buffers for this
> specific event type. These data types are fully expressed in a
> marker format-string table recorded in a "metadata" channel. The
> size of the various basic types and the endianness is recorded in
> the buffer header. Therefore, the binary trace buffers are
> self-described.
>
> Data is exported through binary trace buffers out of kernel-space,
> either by writing directly to disk, sending data over the network,
> crash dump extraction, etc.
Streaming gigabytes of data is mostly done only when we know
_nothing_ useful about a failure mode and are _forced_ into logging
gobs and gobs of data at great expense.
And thus in reality this is a rather uninteresting use case.
We do recognize and support it as it's a valid "last line of defense"
for system and application failure analysis, but we should also put it
all into proper perspective: it's the rare and abnormal exception, not
the design target.
Note that we support this mode of tracing today already: we can
already stream binary data via the ftrace channel - the ring buffer
gives the infrastructure for that. Just do:
# echo bin > /debug/tracing/trace_options
... and you'll get the trace data streamed to user-space in an
efficient, raw, binary data format!
This works here and today - and if you'd like it to become more
efficient within the ftrace framework, we are all for it. (It's
obviously not the default mode of output, because humans prefer ASCII
and scriptable output formats by a _wide_ margin.)
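For reference, the binary mode described above could be driven from user-space roughly as follows. This is only a sketch under stated assumptions: the `/debug/tracing` mount point and the `bin` option come from the command above, while reading the stream from a `trace_pipe` file in that directory, and the helper names `enable_binary_output` and `read_raw_chunks`, are assumptions made for illustration. The sketch treats the data as opaque bytes and does not attempt to decode it.

```python
import os


def enable_binary_output(tracing_dir="/debug/tracing"):
    """Equivalent of: echo bin > <tracing_dir>/trace_options"""
    with open(os.path.join(tracing_dir, "trace_options"), "w") as f:
        f.write("bin\n")


def read_raw_chunks(tracing_dir="/debug/tracing", chunk_size=4096, max_chunks=4):
    """Yield up to max_chunks raw byte chunks from the trace stream.

    Assumption: the stream is exposed as <tracing_dir>/trace_pipe.
    The bytes are returned as-is; no record layout is assumed.
    """
    path = os.path.join(tracing_dir, "trace_pipe")
    # Unbuffered binary read, so each read() maps to one read on the file.
    with open(path, "rb", buffering=0) as pipe:
        for _ in range(max_chunks):
            chunk = pipe.read(chunk_size)
            if not chunk:  # end of data
                break
            yield chunk
```

A caller would run `enable_binary_output()` once as root and then iterate over `read_raw_chunks()`, writing each chunk to disk or a socket. The tracing directory is a parameter so that the helpers can be exercised against an ordinary directory without touching a live kernel.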
Almost by definition anything opaque and binary-only that goes from
the kernel to user-space has fundamental limitations: it just doesn't
actively interact with the kernel for us to be able to form a useful
and flexible filter of information around it.
The _real_ solution to tracing in 99% of the cases is to intelligently
limit information - it's not like the user will read and parse
gigabytes of data ...
Look at the myriad rather useful ftrace plugins we have already
and that sprang up out of nothing. Compare it to the _10 years_ of
inaction that more static tracing concepts created. Those plugins work
and spread because it all lives and breathes within the kernel, and
almost none of that could be achieved via the 'stream binary data to
user-space' model you are concentrating on.
So in the conceptual space i can see little use for markers in the
kernel that are not tracepoints (i.e. not actively used by a real
tracer). We had markers in the scheduler initially, then we moved to
tracepoints - and tracepoints are much nicer.
[ And you wrote both markers and tracepoints, so it's not like i risk
degenerating this discussion into a flamewar by advocating one of
your solutions over the other one ;-) ]
... and in that sense i'd love to see lttng become a "super ftrace
plugin", and be merged upstream ASAP.
We could even split it up into multiple bits as it's merged: for
example syscall tracing would be a nice touch that a couple of other
plugins would adapt as well. But every tracepoint should have some
active role and active connection to a tracer.
And we'd keep all those tracepoints open for external kprobes use as
well - for the dynamic tracers, as a low-cost courtesy. (no long-term
API guarantees though.)
Hm?
Ingo