lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 17 Jan 2012 09:37:49 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Seiji Aguchi <saguchi@...hat.com>,
	linux-kernel@...r.kernel.org,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
Subject: Re: [GIT PULL] tracing: make signal tracepoints more useful

On Tue, 2012-01-17 at 13:40 +0100, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@...dmis.org> wrote:

> Any tool that requests the signal trace event, and copies the 
> full (and now larger) record it got in the ring-buffer, without 
> expanding the target record's size accordingly will *BREAK*. 

I'm curious to where it gets the size?

This is not like the kernel writing to a pointer in userspace memory,
where it can indeed break code by writing too much. This is the
userspace program writing from a shared memory location.


> 
> I do not claim that tools will break in practice - i'm raising 
> the *possibility* out of caution and i'm frustrated that you 
> *STILL* don't understand how ABIs are maintained in Linux. 

You are defending code that would do:

	size = read_size(ring_buffer_event);
	memcpy(data, buffer, size);

over code that would most likely do:

	memcpy(data, buffer, sizeof(*data));

???

According to this logic, we should never increase the size
of /proc/stat, because someone might do:

	i = 0;
	fd = open("/proc/stat", O_RDONLY);
	do {
		r = read(fd, buff+i, BUFSIZ);
		i += r;
	} while (r > 0);



> 
> You arguing about defined semantics is *MEANINGLESS*. What 
> matters is what the apps do in practice.

Exactly, to depend on the ring buffer size to do all copies to fixed
size data structures seems to be backwards to what would be done in
practice.


>  If the apps we know 
> about do it robustly and adapt (or don't care) about the 
> expansion, and if no-one reports a regression in tools we don't 
> know about, then it's probably fine.

It's not about robustness, it's about the easy way to copy.

	memcpy(data, buffer, sizeof(*data));

wont break.


> But your argument that expansion is somehow part of the ABI is 
> patently false and misses the point. Seeing your arguments make 
> me *very* nervous about applying any ABI affecting patch from 
> you.

Well you already think I'm stupid, I wont change the ABI anymore.
Obviously I know nothing, because I created a flexible interface that's
not used by anything except perf and trace-cmd, but because there's no
library, we are stuck with fixed tracepoints, which will come to haunt
us in the not so distant future.

This will bloat the kernel. Tracepoints are not free. They bloat the
kernel's text section. Every tracepoint still adds a bit of code in the
"unlikely" part inlined where they are called. So they do have an affect
on icache, as well as the code to process the tracepoint (around 5k per
tracepoint).

People are adding tracepoints all over the kernel without much thought.
We take much more care when we add a system call. By making tracepoints
have as strict a ABI as system calls will cause a maintenance nightmare
in the future. I guarantee that!


           compiled in my box       in include/trace/events  in rest of kernel
In v3.0:         250                        259                  330
In v3.1:         311                        349                  338
in v3.2:         335                        393                  351
latest Linus:    340                        401                  345

(Interesting that tracepoints have disappeared outside the events
directory)

Note, these do not include the syscall tracepoints.

There's already more tracepoints (compiled for my box) than syscalls.

Tracepoints are being added like the US deficit. We need to set some
rules somewhere. Either by making a library that can handle small
changes (like the one we are discussing, even though a memcpy should
cope), or we need to put a kabosh to adding new tracepoints like they
are the new fad app. Perhaps we should put the same requirements on new
tracepoints as we do with new syscalls.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ