linux-kernel - Re: [GIT PULL] tracing: make signal tracepoints more useful

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120120180059.GB2323@redhat.com>
Date:	Fri, 20 Jan 2012 13:01:00 -0500
From:	Jason Baron <jbaron@...hat.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Seiji Aguchi <saguchi@...hat.com>,
	linux-kernel@...r.kernel.org,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
Subject: Re: [GIT PULL] tracing: make signal tracepoints more useful

On Tue, Jan 17, 2012 at 09:37:49AM -0500, Steven Rostedt wrote:
> On Tue, 2012-01-17 at 13:40 +0100, Ingo Molnar wrote:
> > * Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> > Any tool that requests the signal trace event, and copies the 
> > full (and now larger) record it got in the ring-buffer, without 
> > expanding the target record's size accordingly will *BREAK*. 
> 
> I'm curious to where it gets the size?
> 
> This is not like the kernel writing to a pointer in userspace memory,
> where it can indeed break code by writing too much. This is the
> userspace program writing from a shared memory location.
> 
> 
> > 
> > I do not claim that tools will break in practice - i'm raising 
> > the *possibility* out of caution and i'm frustrated that you 
> > *STILL* don't understand how ABIs are maintained in Linux. 
> 
> You are defending code that would do:
> 
> 	size = read_size(ring_buffer_event);
> 	memcpy(data, buffer, size);
> 
> over code that would most likely do:
> 
> 	memcpy(data, buffer, sizeof(*data));
> 
> ???
> 
> According to this logic, we should never increase the size
> of /proc/stat, because someone might do:
> 
> 	i = 0;
> 	fd = open("/proc/stat", O_RDONLY);
> 	do {
> 		r = read(fd, buff+i, BUFSIZ);
> 		i += r;
> 	} while (r > 0);
> 
> 
> 
> > 
> > You arguing about defined semantics is *MEANINGLESS*. What 
> > matters is what the apps do in practice.
> 
> Exactly, to depend on the ring buffer size to do all copies to fixed
> size data structures seems to be backwards to what would be done in
> practice.
> 
> 
> >  If the apps we know 
> > about do it robustly and adapt (or don't care) about the 
> > expansion, and if no-one reports a regression in tools we don't 
> > know about, then it's probably fine.
> 
> It's not about robustness, it's about the easy way to copy.
> 
> 	memcpy(data, buffer, sizeof(*data));
> 
> wont break.
> 
> 
> > But your argument that expansion is somehow part of the ABI is 
> > patently false and misses the point. Seeing your arguments make 
> > me *very* nervous about applying any ABI affecting patch from 
> > you.
> 
> Well you already think I'm stupid, I wont change the ABI anymore.
> Obviously I know nothing, because I created a flexible interface that's
> not used by anything except perf and trace-cmd, but because there's no
> library, we are stuck with fixed tracepoints, which will come to haunt
> us in the not so distant future.
> 
> This will bloat the kernel. Tracepoints are not free. They bloat the
> kernel's text section. Every tracepoint still adds a bit of code in the
> "unlikely" part inlined where they are called. So they do have an affect
> on icache, as well as the code to process the tracepoint (around 5k per
> tracepoint).
> 

Right, with the jump label optimization, the 'unlikely' branch is
usually moved to the end of the function, with only a single no-op in
the hot-path. However, with gcc enhancement the unlikely label could
be labeled something like 'cold', and moved either further out-of-line.
Its a potential improvement for jump labels, that I need to look into.

Thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/