lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.0903061124400.23248@gandalf.stny.rr.com>
Date:	Fri, 6 Mar 2009 11:59:21 -0500 (EST)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Theodore Tso <tytso@....edu>,
	Arjan van de Ven <arjan@...radead.org>,
	Pekka Paalanen <pq@....fi>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Martin Bligh <mbligh@...gle.com>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Tom Zanussi <tzanussi@...il.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Jason Baron <jbaron@...hat.com>,
	Christoph Hellwig <hch@...radead.org>,
	Jiaying Zhang <jiayingz@...gle.com>,
	Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>,
	mrubin@...gle.com, md@...gle.com
Subject: Re: [PATCH 0/5] [RFC] binary reading of ftrace ring buffers


On Wed, 4 Mar 2009, Mathieu Desnoyers wrote:
> > 
> > From previous patches, we have in include/trace/sched_event_types.h:
> > 
> > #undef TRACE_SYSTEM
> > #define TRACE_SYSTEM sched
> > 
> > TRACE_EVENT_FORMAT(sched_switch,
> > 	TPPROTO(struct rq *rq, struct task_struct *prev,
> > 		struct task_struct *next),
> > 	TPARGS(rq, prev, next),
> > 	TPFMT("task %s:%d ==> %s:%d",
> > 	      prev->comm, prev->pid, next->comm, next->pid),
> > 	TRACE_STRUCT(
> > 		TRACE_FIELD(pid_t, prev_pid, prev->pid)
> > 		TRACE_FIELD(int, prev_prio, prev->prio)
> > 		TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
> > 				    next_comm,
> > 				    TPCMD(memcpy(TRACE_ENTRY->next_comm,
> > 						 next->comm,
> > 						 TASK_COMM_LEN)))
> > 		TRACE_FIELD(pid_t, next_pid, next->pid)
> > 		TRACE_FIELD(int, next_prio, next->prio)
> > 	),
> > 	TPRAWFMT("prev %d:%d ==> next %s:%d:%d")
> > 	);
> > 
> 
> I fear that putting these user-visible format strings in tracepoint
> header files will create a big maintainability issue.
> 
> I'll post the LTTng patchset in a jiffy, where the format string
> awareness is done in a tracer-specific module. I don't understand why
> Peter Z. is not yelling against your tracepoints modifications : they
> are actually presenting to userspace an interface that is meant to
> eventually change.
> 
> I used a separate layer for format string presentation for this very
> purpose : I don't want to tie the kernel code instrumentation
> (tracepoints) to any kind of user-visible API.
> 

Hi Mathieu,

Sorry for the late reply, I started to reply but was distracted, and sadly 
forgot about it :-(  (and I rebooted the box that had the reply window 
open)


Just a couple of points to make.

1)  The raw format string is a "hint" for userspace to read the buffers.
The offset and sizeof is the real data that userspace should use. The
print format can be just a way to help them out. But it is by no means 
something that is expected to be constant.

It is also used internal by ftrace because it needs a way to read the 
"fast C style recording" records. It is better to print some text than to 
have a bunch of hex print out in the screen. Say you have the C style 
recording enabled and you take an oops. With ftrace_dumps_on_oops set, the 
buffers will be printed to the console.  This lets ftrace have a means to 
print out the events.

2) The difference between this and markers is that the format string is 
bound to the declaration in the trace header. Not in the code itself. The 
maintainer should only see the trace point function call. It should only 
look like a function call and not some funky printf string that the 
maintainer might not care about.

If the code changes and it breaks the format string, it is up to the 
tracing maintainers to fix it. If the format string was embedded within 
the code, then the maintainer would be burdoned with that duty. But here 
the format string is in the include/trace directory and any breakage there 
should definitely be fixed by a some tracing maintainer. Not the 
maintainer of the subsystem. Unless of course that maintainer wants to fix 
it ;-)


3) TRACE_EVENT_FORMAT is an extension to TRACE_FORMAT. Developers may 
choose to use the more simple, but a bit more expensive TRACE_FORMAT if 
they choose to. Perhaps it may be more prudent to anyway. Recently
Peter Zijlstra was tracing locks and just wanted to trace the name. 
There's really no difference in copying the name via a C style (strcpy) or 
having a printf formatting do the work. Here is a case where just using 
TRACE_FORMAT is a reasonable solution.

4) I actually avoided moving the format string into the tracing utility. 
The tracing utility can ignore it if it wants. I orginially had the 
headers included in the kernel/trace/events.c file and found that was too 
cumbersome. I did not like the fact that I needed to go into some tracing 
infrastructure to add an event.  It was a burdon to me to add a new event 
and I wrote the code!  That's why I moved it all into the include/trace 
directory. One stop shopping.

A developer to add new events only needs to add the trace points in their 
code and then update the files (perhaps add new ones) in include/trace. No 
need to dig through the kernel source to find out where else they need to 
go.


I see that you published your patch queue. I've looked at most of the 
patches already. I'll make my comments shortly.

Thanks,

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ