linux-kernel - Re: [PATCH 0/5] [RFC] binary reading of ftrace ring buffers


Message-ID: <20090308192127.GA5888@elte.hu>
Date:	Sun, 8 Mar 2009 20:21:27 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jiaying Zhang <jiayingz@...gle.com>
Cc:	Mathieu Desnoyers <compudj@...stal.dyndns.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Theodore Tso <tytso@....edu>,
	Arjan van de Ven <arjan@...radead.org>,
	Pekka Paalanen <pq@....fi>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Martin Bligh <mbligh@...gle.com>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Tom Zanussi <tzanussi@...il.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Jason Baron <jbaron@...hat.com>,
	Christoph Hellwig <hch@...radead.org>,
	Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>,
	mrubin@...gle.com, md@...gle.com
Subject: Re: [PATCH 0/5] [RFC] binary reading of ftrace ring buffers


* Jiaying Zhang <jiayingz@...gle.com> wrote:

> I would like to point out that we think it is really important 
> to have some very efficient probing mechanism in the kernel 
> for tracing in production environments. The printf and va_arg 
> based probes are flexible but less efficient when we want to 
> trace high-throughput events. Even function calls can add 
> noticeable overhead according to our measurements. So I think 
> we need to provide a way (mostly via macro definitions) with 
> which a subsystem can enter an event into a trace buffer 
> through a short code path. I.e., we should limit the number of 
> callbacks and avoid format string parsing.
> 
> As I understand, Steven's latest TRACE_FIELD patch avoids such 
> overhead, although it does seem to add complexity for adding 
> new trace points. [...]

Yeah - it was motivated by the patches you sent to lkml which 
showed that it's possible to do it quite sanely and that it can 
be done faster.
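
To make the "short code path" point concrete, here is a minimal 
user-space sketch, not ftrace code. It assumes a flat byte buffer 
and hypothetical demo_* names: the hot path only copies fixed-size 
binary fields, and format string handling is deferred to the 
reader.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_BUF_SIZE 256

/* Hypothetical fixed-layout record; not an ftrace structure. */
struct demo_switch_rec {
	int32_t prev_pid;
	int32_t next_pid;
};

/* Hypothetical flat buffer standing in for a real ring buffer. */
static unsigned char demo_buf[DEMO_BUF_SIZE];
static size_t demo_used;

/* Hot path: bounds check plus one memcpy, no format parsing. */
static int demo_trace_switch(int32_t prev_pid, int32_t next_pid)
{
	struct demo_switch_rec rec = { prev_pid, next_pid };

	if (demo_used + sizeof(rec) > sizeof(demo_buf))
		return -1;	/* buffer full: drop the event */
	memcpy(demo_buf + demo_used, &rec, sizeof(rec));
	demo_used += sizeof(rec);
	return 0;
}

/* Read side: formatting happens only when a record is consumed. */
static int demo_format_rec(size_t idx, char *out, size_t len)
{
	struct demo_switch_rec rec;

	assert((idx + 1) * sizeof(rec) <= demo_used);
	memcpy(&rec, demo_buf + idx * sizeof(rec), sizeof(rec));
	return snprintf(out, len, "switch %d ==> %d",
			(int)rec.prev_pid, (int)rec.next_pid);
}
```

A caller would invoke demo_trace_switch() at the event site and 
demo_format_rec() later from the reader, so the cost of snprintf() 
stays out of the traced path.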

> [...] It would be nice if we can replace the above 
> sched_switch declaration with just a couple of macros.

Good point - there's ongoing work to simplify the TRACE_FIELD 
approach. The current (not yet pushed out) optimized tracepoint 
format Steve is working on is:

/*
 * Tracepoint for task switches, performed by the scheduler:
 *
 * (NOTE: the 'rq' argument is not used by generic trace events,
 *        but used by the latency tracer plugin. )
 */
TRACE_EVENT(sched_switch,

	TP_PROTO(struct rq *rq, struct task_struct *prev,
		 struct task_struct *next),

	TP_ARGS(rq, prev, next),

	TP_STRUCT__entry(
		__array(	char,	prev_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	prev_pid			)
		__field(	int,	prev_prio			)
		__array(	char,	next_comm,	TASK_COMM_LEN	)
		__field(	pid_t,	next_pid			)
		__field(	int,	next_prio			)
	),

	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
		__entry->next_comm, __entry->next_pid, __entry->next_prio),

	TP_fast_assign(
		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
		__entry->prev_pid	= prev->pid;
		__entry->prev_prio	= prev->prio;
		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
		__entry->next_pid	= next->pid;
		__entry->next_prio	= next->prio;
	)
);

As you can see, it enumerates the fields, provides format-based 
tracing, and defines the tracepoint as well. It also looks quite 
similar to C syntax while still being an information-dense macro.
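
For readers wondering how one macro invocation can yield a record 
layout, an assignment path and a format-based printer, here is a 
rough user-space sketch. It is not the kernel's implementation: 
DEMO_TRACE_EVENT, struct demo_task and all other demo_* names are 
hypothetical, and the entry pointer is called "entry" rather than 
"__entry" to stay clear of reserved identifiers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_COMM_LEN 16

/* Hypothetical stand-in for the few task_struct members used here. */
struct demo_task {
	char comm[DEMO_COMM_LEN];
	int pid;
	int prio;
};

/* Wrappers that let a comma-separated list travel as one argument. */
#define DEMO_PROTO(...)		__VA_ARGS__
#define DEMO_STRUCT(...)	__VA_ARGS__
#define DEMO_PRINT(...)		__VA_ARGS__
#define DEMO_ASSIGN(...)	__VA_ARGS__

/* Each field line becomes one struct member. */
#define DEMO_FIELD(type, name)		type name;
#define DEMO_ARRAY(type, name, len)	type name[len];

/* One invocation emits the record struct, a recorder and a printer. */
#define DEMO_TRACE_EVENT(name, proto, fields, print, assign)		\
	struct demo_entry_##name { fields };				\
	static void demo_record_##name(struct demo_entry_##name *entry,	\
				       proto)				\
	{ assign }							\
	static int demo_print_##name(char *out, size_t len,		\
			const struct demo_entry_##name *entry)		\
	{ assert(out != NULL); return snprintf(out, len, print); }

DEMO_TRACE_EVENT(sched_switch,

	DEMO_PROTO(const struct demo_task *prev,
		   const struct demo_task *next),

	DEMO_STRUCT(
		DEMO_ARRAY(char, prev_comm, DEMO_COMM_LEN)
		DEMO_FIELD(int,  prev_pid)
		DEMO_FIELD(int,  prev_prio)
		DEMO_ARRAY(char, next_comm, DEMO_COMM_LEN)
		DEMO_FIELD(int,  next_pid)
		DEMO_FIELD(int,  next_prio)
	),

	DEMO_PRINT("task %s:%d [%d] ==> %s:%d [%d]",
		entry->prev_comm, entry->prev_pid, entry->prev_prio,
		entry->next_comm, entry->next_pid, entry->next_prio),

	DEMO_ASSIGN(
		memcpy(entry->prev_comm, prev->comm, DEMO_COMM_LEN);
		entry->prev_pid  = prev->pid;
		entry->prev_prio = prev->prio;
		memcpy(entry->next_comm, next->comm, DEMO_COMM_LEN);
		entry->next_pid  = next->pid;
		entry->next_prio = next->prio;
	)
)
```

In this sketch the recorder only copies fixed-size fields into the 
entry, while the format string is touched only by the generated 
demo_print_sched_switch(), which a reader would call later.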

	Ingo