[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20060914223607.GB25004@elte.hu>
Date: Fri, 15 Sep 2006 00:36:07 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Martin Bligh <mbligh@...igh.org>
Cc: Roman Zippel <zippel@...ux-m68k.org>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
linux-kernel@...r.kernel.org,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@...l.org>, Ingo Molnar <mingo@...hat.com>,
Greg Kroah-Hartman <gregkh@...e.de>,
Thomas Gleixner <tglx@...utronix.de>,
Tom Zanussi <zanussi@...ibm.com>, ltt-dev@...fik.org,
Michel Dagenais <michel.dagenais@...ymtl.ca>, fche@...hat.com
Subject: Re: [PATCH 0/11] LTTng-core (basic tracing infrastructure) 0.5.108
* Martin Bligh <mbligh@...igh.org> wrote:
> > i very much agree that they should become as fast as possible. So to
> > rephrase the question: can we make dynamic tracepoints as fast (or
> > nearly as fast) as static tracepoints? If yes, should we care about
> > static tracers at all?
>
> Depends how many nops you're willing to add, I guess. Anything, even
> the static tracepoints really needs at least a branch to be useful,
> IMHO. At least for what I've been doing with it, you need to stop the
> data flow after a while (when the event you're interested in happens,
> I'm using it like a flight data recorder, so we can go back and do
> postmortem on what went wrong). I should imagine branch prediction
> makes it very cheap on most modern CPUs, but don't have hard data to
> hand.
only 5 bytes of NOP are needed by default, so that a kprobe can insert a
call/callq instruction. The easiest way in practice is to insert a
_single_, unconditional function call that is patched out to NOPs upon
its first occurance (doing this is not a performance issue at all). That
way the only cost is the NOP and the function parameter preparation
side-effects. (which might or might not be significant - with register
calling conventions and most parameters being readily available it
should be small.)
note that such a limited, minimally invasive 'data extraction point'
infrastructure is not actually what the LTT patches are doing. It's not
even close, and i think you'll be surprised. Let me quote from the
latest LTT patch (patch-2.6.17-lttng-0.5.108, which is the same version
submitted to lkml - although no specific tracepoints were submitted):
+/* Event wakeup logging function */
+static inline void trace_process_wakeup(
+ unsigned int lttng_param_pid,
+ int lttng_param_state)
+#if (!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+{
+}
+#else
+{
+ unsigned int index;
+ struct ltt_channel_struct *channel;
+ struct ltt_trace_struct *trace;
+ void *transport_data;
+ char *buffer = NULL;
+ size_t real_to_base = 0; /* The buffer is allocated on arch_size alignment */
+ size_t *to_base = &real_to_base;
+ size_t real_to = 0;
+ size_t *to = &real_to;
+ size_t real_len = 0;
+ size_t *len = &real_len;
+ size_t reserve_size;
+ size_t slot_size;
+ size_t align;
+ const char *real_from;
+ const char **from = &real_from;
+ u64 tsc;
+ size_t before_hdr_pad, after_hdr_pad, header_size;
+
+ if(ltt_traces.num_active_traces == 0) return;
+
+ /* For each field, calculate the field size. */
+ /* size = *to_base + *to + *len */
+ /* Assume that the padding for alignment starts at a
+ * sizeof(void *) address. */
+
+ *from = (const char*)<tng_param_pid;
+ align = sizeof(unsigned int);
+
+ if(*len == 0) {
+ *to += ltt_align(*to, align); /* align output */
+ } else {
+ *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+ }
+
+ *len += sizeof(unsigned int);
+
+ *from = (const char*)<tng_param_state;
+ align = sizeof(int);
+
+ if(*len == 0) {
+ *to += ltt_align(*to, align); /* align output */
+ } else {
+ *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+ }
+
+ *len += sizeof(int);
+
+ reserve_size = *to_base + *to + *len;
+ preempt_disable();
+ ltt_nesting[smp_processor_id()]++;
+ index = ltt_get_index_from_facility(ltt_facility_process_2905B6EB,
+ event_process_wakeup);
+
+ list_for_each_entry_rcu(trace, <t_traces.head, list) {
+ if(!trace->active) continue;
+
+ channel = ltt_get_channel_from_index(trace, index);
+
+ slot_size = 0;
+ buffer = ltt_reserve_slot(trace, channel, &transport_data,
+ reserve_size, &slot_size, &tsc,
+ &before_hdr_pad, &after_hdr_pad, &header_size);
+ if(!buffer) continue; /* buffer full */
+
+ *to_base = *to = *len = 0;
+
+ ltt_write_event_header(trace, channel, buffer,
+ ltt_facility_process_2905B6EB, event_process_wakeup,
+ reserve_size, before_hdr_pad, tsc);
+ *to_base += before_hdr_pad + after_hdr_pad + header_size;
+
+ *from = (const char*)<tng_param_pid;
+ align = sizeof(unsigned int);
+
+ if(*len == 0) {
+ *to += ltt_align(*to, align); /* align output */
+ } else {
+ *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+ }
+
+ *len += sizeof(unsigned int);
+
+ /* Flush pending memcpy */
+ if(*len != 0) {
+ memcpy(buffer+*to_base+*to, *from, *len);
+ *to += *len;
+ *len = 0;
+ }
+
+ *from = (const char*)<tng_param_state;
+ align = sizeof(int);
+
+ if(*len == 0) {
+ *to += ltt_align(*to, align); /* align output */
+ } else {
+ *len += ltt_align(*to+*len, align); /* alignment, ok to do a memcpy of it */
+ }
+
+ *len += sizeof(int);
+
+ /* Flush pending memcpy */
+ if(*len != 0) {
+ memcpy(buffer+*to_base+*to, *from, *len);
+ *to += *len;
+ *len = 0;
+ }
+
+ ltt_commit_slot(channel, &transport_data, buffer, slot_size);
+
+ }
+
+ ltt_nesting[smp_processor_id()]--;
+ preempt_enable_no_resched();
+}
+#endif //(!defined(CONFIG_LTT) || !defined(CONFIG_LTT_FACILITY_PROCESS))
+
believe it or not, this is inlined into: kernel/sched.c ...
'enuff said. LTT is so far from being even considerable that it's not
even funny.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists