Message-ID: <alpine.DEB.2.00.0903171136090.10274@gandalf.stny.rr.com>
Date:	Tue, 17 Mar 2009 11:41:05 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Christoph Hellwig <hch@...radead.org>
cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Theodore Tso <tytso@....edu>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Mathieu Desnoyers <compudj@...stal.dyndns.org>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	"Martin J. Bligh" <mbligh@...igh.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Larry Woodman <lwoodman@...hat.com>,
	Jason Baron <jbaron@...hat.com>,
	Tom Zanussi <tzanussi@...il.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Jiaying Zhang <jiayingz@...gle.com>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [PATCH 4/7] tracing: new format for specialized trace points


On Tue, 17 Mar 2009, Christoph Hellwig wrote:

> On Tue, Mar 10, 2009 at 12:57:14AM -0400, Steven Rostedt wrote:
> > Here's the example. The only updated macro in this patch is the
> > sched_switch trace point.
> 
> Note that we shouldn't keep two variants around long-term, that's
> just going to cause confusion.
> 
> > The old method looked like this:
> > 
> >  TRACE_EVENT_FORMAT(sched_switch,
> >         TP_PROTO(struct rq *rq, struct task_struct *prev,
> >                 struct task_struct *next),
> >         TP_ARGS(rq, prev, next),
> >         TP_FMT("task %s:%d ==> %s:%d",
> >               prev->comm, prev->pid, next->comm, next->pid),
> >         TRACE_STRUCT(
> >                 TRACE_FIELD(pid_t, prev_pid, prev->pid)
> >                 TRACE_FIELD(int, prev_prio, prev->prio)
> >                 TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
> >                                     next_comm,
> >                                     TP_CMD(memcpy(TRACE_ENTRY->next_comm,
> >                                                  next->comm,
> >                                                  TASK_COMM_LEN)))
> >                 TRACE_FIELD(pid_t, next_pid, next->pid)
> >                 TRACE_FIELD(int, next_prio, next->prio)
> >         ),
> >         TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
> >         );
> > 
> > The above method is hard to read and requires two format fields.
> > 
> > The new method:
> > 
> >  /*
> >   * Tracepoint for task switches, performed by the scheduler:
> >   *
> >   * (NOTE: the 'rq' argument is not used by generic trace events,
> >   *        but used by the latency tracer plugin. )
> >   */
> >  TRACE_EVENT(sched_switch,
> > 
> > 	TP_PROTO(struct rq *rq, struct task_struct *prev,
> > 		 struct task_struct *next),
> > 
> > 	TP_ARGS(rq, prev, next),
> > 
> > 	TP_STRUCT__entry(
> > 		__array(	char,	prev_comm,	TASK_COMM_LEN	)
> > 		__field(	pid_t,	prev_pid			)
> > 		__field(	int,	prev_prio			)
> > 		__array(	char,	next_comm,	TASK_COMM_LEN	)
> > 		__field(	pid_t,	next_pid			)
> > 		__field(	int,	next_prio			)
> > 	),
> > 
> > 	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
> > 		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
> > 		__entry->next_comm, __entry->next_pid, __entry->next_prio),
> > 
> > 	TP_fast_assign(
> > 		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
> > 		__entry->prev_pid	= prev->pid;
> > 		__entry->prev_prio	= prev->prio;
> > 		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
> > 		__entry->next_pid	= next->pid;
> > 		__entry->next_prio	= next->prio;
> > 	)
> >  );
> 
> While the idea behind it seems like an improvement to me, the
> implementation actually feels worse than the old one to me.  I would
> expect this to look more like:
> 
> struct trace_sched_switch {
> 	char	prev_comm[TASK_COMM_LEN];
> 	pid_t	prev_pid;
> 	int	prev_prio;
> 	char	next_comm[TASK_COMM_LEN];
> 	pid_t	next_pid;
> 	int	next_prio;
> };

We would love to do the above. The problem is that we also need a way
to automatically export each field's offset and size to userspace. That
is what the "__field()" and "__array()" macros do for us. Without them,
we would need to do that manually.
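
To illustrate why a macro-based field list helps here, below is a
minimal userspace sketch, not the actual kernel implementation. It
assumes a hypothetical SCHED_SWITCH_FIELDS() list that is expanded
twice: once into the struct and once into a table of offsets and sizes.
All names in it (SCHED_SWITCH_FIELDS, DECL_*, DESC_*, field_desc,
find_field, print_event_format) are made up for the example, and "int"
stands in for pid_t to keep it self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* restated here so the sketch builds alone */

/*
 * Hypothetical field list (not a kernel macro): each field is written
 * once and expanded twice below. "int" stands in for pid_t.
 */
#define SCHED_SWITCH_FIELDS(FIELD, ARRAY)		\
	ARRAY(char,	prev_comm,	TASK_COMM_LEN)	\
	FIELD(int,	prev_pid)			\
	FIELD(int,	prev_prio)			\
	ARRAY(char,	next_comm,	TASK_COMM_LEN)	\
	FIELD(int,	next_pid)			\
	FIELD(int,	next_prio)

/* Expansion 1: the record layout itself. */
#define DECL_FIELD(type, name)		type name;
#define DECL_ARRAY(type, name, len)	type name[len];

struct sched_switch_entry {
	SCHED_SWITCH_FIELDS(DECL_FIELD, DECL_ARRAY)
};

/* Expansion 2: a description of every field, generated from the same list. */
struct field_desc {
	const char *type;
	const char *name;
	size_t offset;
	size_t size;
};

#define DESC_FIELD(type, name)					\
	{ #type, #name,						\
	  offsetof(struct sched_switch_entry, name), sizeof(type) },
#define DESC_ARRAY(type, name, len)				\
	{ #type, #name,						\
	  offsetof(struct sched_switch_entry, name), sizeof(type) * (len) },

static const struct field_desc sched_switch_fields[] = {
	SCHED_SWITCH_FIELDS(DESC_FIELD, DESC_ARRAY)
};

#define N_FIELDS \
	(sizeof(sched_switch_fields) / sizeof(sched_switch_fields[0]))

/* Look a field up by name, as a userspace parser might; NULL if absent. */
static const struct field_desc *find_field(const char *name)
{
	size_t i;

	for (i = 0; i < N_FIELDS; i++)
		if (strcmp(sched_switch_fields[i].name, name) == 0)
			return &sched_switch_fields[i];
	return NULL;
}

/* Sanity check: every described field lies inside the struct. */
static void check_layout(void)
{
	size_t i;

	for (i = 0; i < N_FIELDS; i++)
		assert(sched_switch_fields[i].offset +
		       sched_switch_fields[i].size <=
		       sizeof(struct sched_switch_entry));
}

/* Print one line per field: type, name, offset and size. */
static void print_event_format(FILE *out)
{
	size_t i;

	for (i = 0; i < N_FIELDS; i++)
		fprintf(out, "field:%s %s;\toffset:%zu;\tsize:%zu;\n",
			sched_switch_fields[i].type,
			sched_switch_fields[i].name,
			sched_switch_fields[i].offset,
			sched_switch_fields[i].size);
}
```

Calling print_event_format(stdout) from a main() prints one line per
field. With a hand-written struct, the table would be a second copy of
the same information that has to be kept in sync by hand.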


> 
> static void trace_sched_assign(struct trace_sched_switch *dst, struct rq *rq,
> 		struct task_struct *prev, struct task_struct *next)
> {
> 	memcpy(dst->next_comm, next->comm, TASK_COMM_LEN);
> 	dst->prev_pid	= prev->pid;
> 	dst->prev_prio	= prev->prio;
> 	memcpy(dst->prev_comm, prev->comm, TASK_COMM_LEN);
> 	dst->next_pid	= next->pid;
> 	dst->next_prio	= next->prio;
> }

This we could take out of the macro and make a function.

-- Steve

> 
> 
> TRACE_EVENT(sched_switch,
> 	trace_proto(struct rq *rq, struct task_struct *prev,
>  		    struct task_struct *next),
> 	trace_args(rq, prev, next),
> 	trace_struct(struct trace_sched_switch),
> 	trace_assign(trace_sched_assign),
> 
> 	trace_pretty_print("task %s:%d [%d] ==> %s:%d [%d]",
> 		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
> 		__entry->next_comm, __entry->next_pid, __entry->next_prio)
> );
> 
> 