Message-ID: <20090308192127.GA5888@elte.hu>
Date: Sun, 8 Mar 2009 20:21:27 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Jiaying Zhang <jiayingz@...gle.com>
Cc: Mathieu Desnoyers <compudj@...stal.dyndns.org>,
Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Theodore Tso <tytso@....edu>,
Arjan van de Ven <arjan@...radead.org>,
Pekka Paalanen <pq@....fi>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Martin Bligh <mbligh@...gle.com>,
"Frank Ch. Eigler" <fche@...hat.com>,
Tom Zanussi <tzanussi@...il.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Jason Baron <jbaron@...hat.com>,
Christoph Hellwig <hch@...radead.org>,
Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>,
mrubin@...gle.com, md@...gle.com
Subject: Re: [PATCH 0/5] [RFC] binary reading of ftrace ring buffers
* Jiaying Zhang <jiayingz@...gle.com> wrote:
> I would like to point out that we think it is really important
> to have some very efficient probing mechanism in the kernel
> for tracing in production environments. The printf and va_arg
> based probes are flexible but less efficient when we want to
> trace high-throughput events. Even function calls can add
> noticeable overhead according to our measurements. So I think
> we need to provide a way (mostly via macro definitions) with
> which a subsystem can enter an event into a trace buffer
> through a short code path. I.e., we should limit the number of
> callbacks and avoid format string parsing.
>
> As I understand, Steven's latest TRACE_FIELD patch avoids such
> overhead, although it does seem to add complexity for adding
> new trace points. [...]
Yeah - it was motivated by the patches you sent to lkml which
showed that it's possible to do it quite sanely and that it can
be done faster.
> [...] It would be nice if we can replace the above
> sched_switch declaration with just a couple of macros.
Good point - there's ongoing work to simplify the TRACE_FIELD
approach. The current (not yet pushed out) optimized tracepoint
format Steve is working on is:
/*
 * Tracepoint for task switches, performed by the scheduler:
 *
 * (NOTE: the 'rq' argument is not used by generic trace events,
 *        but used by the latency tracer plugin. )
 */
TRACE_EVENT(sched_switch,

	TP_PROTO(struct rq *rq, struct task_struct *prev,
		 struct task_struct *next),

	TP_ARGS(rq, prev, next),

	TP_STRUCT__entry(
		__array(	char,	prev_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	prev_pid			)
		__field(	int,	prev_prio			)
		__array(	char,	next_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	next_pid			)
		__field(	int,	next_prio			)
	),

	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
		__entry->next_comm, __entry->next_pid, __entry->next_prio),

	TP_fast_assign(
		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
		__entry->prev_pid	= prev->pid;
		__entry->prev_prio	= prev->prio;
		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
		__entry->next_pid	= next->pid;
		__entry->next_prio	= next->prio;
	)
);
As you can see it enumerates fields, provides format-based
tracing and a tracepoint as well. It also looks quite similar to
C syntax while still being an information-dense macro.
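To make the split between the fast path and the formatting path concrete, here is a rough userspace sketch of the idea behind the three parts of the macro. It is not the actual TRACE_EVENT expansion. Every name prefixed with demo_ is hypothetical, struct demo_task merely stands in for the few task_struct members used above, and plain int is used in place of pid_t to keep the example free of extra headers.

```c
/* Userspace sketch only: hypothetical demo_ names, not the kernel's expansion. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* same length the kernel uses for comm[] */

/* Assumption: stand-in for the task_struct members the tracepoint reads. */
struct demo_task {
	char comm[TASK_COMM_LEN];
	int pid;
	int prio;
};

/* Fixed-layout binary record, mirroring the TP_STRUCT__entry field list. */
struct demo_sched_switch_entry {
	char prev_comm[TASK_COMM_LEN];
	int prev_pid;
	int prev_prio;
	char next_comm[TASK_COMM_LEN];
	int next_pid;
	int next_prio;
};

/* Hot path (cf. TP_fast_assign): plain stores and memcpy, no format parsing. */
static void demo_fast_assign(struct demo_sched_switch_entry *entry,
			     const struct demo_task *prev,
			     const struct demo_task *next)
{
	memcpy(entry->prev_comm, prev->comm, TASK_COMM_LEN);
	entry->prev_pid = prev->pid;
	entry->prev_prio = prev->prio;
	memcpy(entry->next_comm, next->comm, TASK_COMM_LEN);
	entry->next_pid = next->pid;
	entry->next_prio = next->prio;
}

/* Slow path (cf. TP_printk): the record is formatted only when it is read. */
static int demo_format(const struct demo_sched_switch_entry *entry,
		       char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "task %s:%d [%d] ==> %s:%d [%d]",
			entry->prev_comm, entry->prev_pid, entry->prev_prio,
			entry->next_comm, entry->next_pid, entry->next_prio);
}
```

A caller would fill a record with demo_fast_assign() at the switch site and call demo_format() only when the buffer is read out as text. A binary reader could instead copy the record as-is, since the field list fixes its layout.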
Ingo