Message-ID: <4B60067B.4060708@redhat.com>
Date: Wed, 27 Jan 2010 11:25:15 +0200
From: Avi Kivity <avi@...hat.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Peter Zijlstra <peterz@...radead.org>,
Jim Keniston <jkenisto@...ibm.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
ananth@...ibm.com, Arnaldo Carvalho de Melo <acme@...radead.org>,
utrace-devel <utrace-devel@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
Maneesh Soni <maneesh@...ibm.com>,
Mark Wielaard <mjw@...hat.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On 01/27/2010 11:08 AM, Ingo Molnar wrote:
>
>> I see it exactly the opposite. Only a very small minority of cases will
>> have such severe memory corruption that tracing will fall apart because of
>> random writes to memory; especially on 64-bit where the address space is
>> sparse. On the other hand, knowing that the cost is a few dozen cycles
>> rather than a thousand or so means that you can trace production servers
>> running full loads without worrying about whether tracing will affect
>> whatever it is you're trying to observe.
>>
>> I'm not against slow reliable tracing, but we shouldn't ignore the need for
>> speed.
>>
> I haven't seen a concise summary of your points in this thread, so let me
> summarize it as i've understood them (hopefully not putting words into your
> mouth): AFAICS you are arguing for some crazy fragile architecture-specific
> solution that traps INT3 into ring3 just to shave off a few cycles, and then
> use user-space state to trace into.
>
That's a good summary, except for the words "crazy fragile", "trap INT3
into ring3" and "a few cycles".
Instead of using int 3, put a jump instruction in the program. This
shaves a lot more than a few cycles.
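To make the size difference concrete, here is a minimal sketch of the two encodings on x86. `int 3` is the single byte 0xCC, while a near `jmp rel32` is the opcode 0xE9 followed by a 32-bit displacement measured from the end of the jump, so it overwrites five bytes at the probe site. The helper name and the addresses in the usage note are hypothetical, and the sketch does not show how the displaced original instructions would be relocated, which any real implementation would have to handle.

```python
import struct

INT3 = b"\xcc"            # 1-byte x86 breakpoint instruction
JMP_REL32_OPCODE = 0xE9   # x86 near jump, followed by a signed 32-bit displacement
JMP_LEN = 5               # opcode byte + 4 displacement bytes


def encode_jmp_rel32(probe_addr, target_addr):
    """Return the 5 bytes of a `jmp rel32` placed at probe_addr that lands on target_addr.

    Hypothetical helper for illustration only; it builds the bytes and
    does not patch any process.
    """
    # The displacement is relative to the address of the next instruction.
    disp = target_addr - (probe_addr + JMP_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range")
    return struct.pack("<Bi", JMP_REL32_OPCODE, disp)
```

For example, `encode_jmp_rel32(0x1000, 0x2000)` gives the bytes `e9 fb 0f 00 00`. The range check matters here: the jump target has to sit within 2 GB of the probe site.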
> If so then you ignore the obvious solution to _that_ problem: don't use INT3 at
> all, but rebuild (or re-JIT) your program with explicit callbacks. It's _MUCH_
> faster than _any_ breakpoint based solution - literally just the cost of a
> function call (or not even that - i've written very fast inlined tracers -
> they do rock when it comes to performance). Problem solved and none of the
> INT3 details matters at all.
>
However did I not think of that? Yes, and let's rip kprobes tracing out
of the kernel, we can always rebuild it.
Well, I'm observing an issue in a production system now. I may not want
to take it down, or if I take it down I may not be able to observe it
again as the problem takes a couple of days to show up, or I may not
have the full source, or it takes 10 minutes to build and so an
iterative edit/build/run cycle can stretch for hours.
Adding a vma to a running program is very unlikely to affect it. If the
program makes random accesses to memory, it will likely segfault very
quickly before we ever get to trace it.
> INT3 only matters to _transparent_ probing, and for that, the cost of INT3 is
> almost _by definition_ less important than the fact that we can do transparent
> tracing. If performance were the overriding issue they'd use dedicated
> callbacks - and the INT3 technique wouldn't matter at all.
>
INT3 isn't transparent. The only thing that comes close to full
transparency is hardware breakpoints. So we have a tradeoff between
transparency and speed, and except for the weirdest bugs, this level of
transparency won't be needed.
> ( Also, just like we were able to extend the kprobes code with more and more
> optimizations, the same can be done with any user-space probing as well, to
> make it faster. But at the core of it has to be a sane design that is
> transparent and controlled by the kernel, so that it has the option to apply
> more and more optimizations - yours isn't such and its limitations are
> designed-in.
No design is fully transparent, and I don't see why my design can't be
controlled by the kernel.
> Which is neither smart nor useful. )
>
This style of arguing is neither smart nor useful either.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.