Message-ID: <20100116184833.2s0zihwbggkgccsk@imap.linux.ibm.com>
Date: Sat, 16 Jan 2010 18:48:33 -0500
From: Jim Keniston <jkenisto@...ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...e.hu>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
utrace-devel <utrace-devel@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
Maneesh Soni <maneesh@...ibm.com>,
Mark Wielaard <mjw@...hat.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

Quoting Peter Zijlstra <peterz@...radead.org>:
> On Fri, 2010-01-15 at 16:58 -0800, Jim Keniston wrote:
>> But here are some things to keep in mind about the
>> various approaches:
>>
>> 1. Single-stepping inline is easiest: you need to know very little about
>> the instruction set you're probing. But it's inadequate for
>> multithreaded apps.
>> 2. Single-stepping out of line solves the multithreading issue (as do #3
>> and #4), but requires more knowledge of the instruction set. (In
>> particular, calls, jumps, and returns need special care; as do
>> rip-relative instructions in x86_64.) I count 9 architectures that
>> support kprobes. I think most of these do SSOL.
>> 3. "Boosted" probes (where an appended jump instruction removes the need
>> for the single-step trap on many instructions) require even more
>> knowledge of the instruction set, and like SSOL, require XOL slots.
>> Right now, as far as I know, x86 is the only architecture with boosted
>> kprobes.
>> 4. Emulation removes the need for the XOL area, but requires pretty much
>> total knowledge of the instruction set. It's also a performance win for
>> architectures that can't do #3. I see kvm implemented on 4
>> architectures (ia64, powerpc, s390, x86). Coincidentally, those are the
>> architectures to which uprobes (old uprobes, with ubp and xol bundled
>> in) has already been ported (though Intel hasn't been maintaining their
>> ia64 port).
>
> Right, so I was thinking a combination of 4 and execute from kernel
> space would be feasible. I would think most regular instructions are
> runnable from kernel space given that we provide the proper pt_regs
> environment.
>
> Although I just realize we need to fully emulate the address computation
> step for all memory writes, otherwise a wild userspace pointer might end
> up writing in your kernel image.

Correct.
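To make the concern concrete, here is a minimal user-space sketch (not code from the patch set) of computing an x86-style effective address and refusing any write that would land outside the user range. The struct, the function names, and the USER_ADDR_LIMIT value are assumptions made for illustration; in the kernel proper, access_ok() is the usual way to do this kind of range check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration: lower canonical half with 47-bit user addresses. */
#define USER_ADDR_LIMIT 0x0000800000000000ULL

/* Hypothetical decoded memory operand: base + index * scale + disp. */
struct mem_operand {
    uint64_t base;   /* value of the base register */
    uint64_t index;  /* value of the index register, 0 if none */
    uint8_t  scale;  /* 1, 2, 4 or 8 */
    int32_t  disp;   /* sign-extended displacement */
};

/* Emulate the address computation; the sum wraps modulo 2^64. */
static uint64_t effective_address(const struct mem_operand *op)
{
    return op->base + op->index * op->scale + (uint64_t)(int64_t)op->disp;
}

/* True only if [addr, addr + len) lies entirely below USER_ADDR_LIMIT. */
static bool user_write_ok(const struct mem_operand *op, size_t len)
{
    uint64_t addr = effective_address(op);

    if (len == 0 || len > USER_ADDR_LIMIT)
        return false;
    /* Written this way so that addr + len cannot overflow. */
    return addr <= USER_ADDR_LIMIT - len;
}
```

The point of the sketch is that the check is applied to the computed address, not to the register values: a user-controlled base or index that yields a kernel address is rejected before any store is performed.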
>
> Also, don't we already need full knowledge of the instruction set in
> order to decode the instruction stream and find instruction boundaries.

Not really. For #3 (boosting), you need to know everything for #2,
plus be able to compute the length of each instruction -- which we can
now do for x86. To emulate an instruction (#4), you need to replicate
what it does, side effects and all. The x86 instruction set seems to
be adding new floating-point instructions all the time, and I bet even
Masami doesn't know what they all do, but so far, they all seem to
adhere to the instruction-length rules encoded in Masami's instruction
decoder.
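The reason boosting needs the instruction length can be sketched as follows: the copied instruction in the XOL slot is followed by a jmp rel32 (opcode 0xE9) whose displacement is measured from the end of the jump to the address just past the probed instruction. This is a hedged user-space illustration, not the kprobes or uprobes code; build_boosted_slot() and its parameters are hypothetical, and it assumes an instruction that can be copied verbatim (no call, jump, return, or rip-relative operand).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_OPCODE 0xE9  /* x86 near jump with 32-bit displacement */
#define JMP_REL32_SIZE   5     /* 1 opcode byte + 4 displacement bytes */

/*
 * Hypothetical helper: fill an XOL slot with a copy of the probed
 * instruction followed by a jump back to the next original instruction.
 * Returns the number of bytes written, or 0 if the slot is too small or
 * the jump target is out of rel32 range.
 */
static size_t build_boosted_slot(uint8_t *slot, size_t slot_size,
                                 uint64_t slot_addr,
                                 const uint8_t *insn, size_t insn_len,
                                 uint64_t probe_addr)
{
    if (insn_len + JMP_REL32_SIZE > slot_size)
        return 0;

    /* Where execution resumes: just past the probed instruction. */
    uint64_t resume = probe_addr + insn_len;
    /* rel32 is relative to the address following the jump itself. */
    uint64_t jmp_end = slot_addr + insn_len + JMP_REL32_SIZE;
    int64_t rel = (int64_t)(resume - jmp_end);

    if (rel < INT32_MIN || rel > INT32_MAX)
        return 0;

    memcpy(slot, insn, insn_len);
    slot[insn_len] = JMP_REL32_OPCODE;

    /* Store the displacement in little-endian order, as x86 expects. */
    uint32_t urel = (uint32_t)(int32_t)rel;
    for (int i = 0; i < 4; i++)
        slot[insn_len + 1 + i] = (uint8_t)(urel >> (8 * i));

    return insn_len + JMP_REL32_SIZE;
}
```

Without a reliable insn_len, neither the resume address nor the position of the appended jump can be computed, which is why instruction-length decoding is the extra requirement over plain SSOL.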
As you may have noted before, I think FP would be a special problem
for your approach. I'm not sure how folks would react to the idea of
executing FP instructions in kernel space. But emulating them is also
tough. There's an IEEE FP emulation package somewhere in one of the
Linux arch directories, but I'm not sure how precise it is, and
dropping even 1 bit of precision is unacceptable for many
applications, since such errors tend to grow in complex computations
employing many FP instructions.
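As a contrived illustration of how a one-bit error can grow (this example is not from the thread, and amplify() and one_ulp_error_after() are made-up names), consider a recurrence that doubles the distance from 1.0 at every step. Feeding it an input that is off by a single unit in the last place, standing in for an emulator that gets the low bit wrong, shows the discrepancy doubling with each operation.

```c
#include <float.h>

/*
 * Apply x = (x - 1) * 2 + 1 the given number of times.
 * Each step doubles the distance between x and 1.0.
 */
static double amplify(double x, int steps)
{
    for (int i = 0; i < steps; i++)
        x = (x - 1.0) * 2.0 + 1.0;
    return x;
}

/*
 * Difference between the result for an input that is one ulp above 1.0
 * (1.0 + DBL_EPSILON) and the result for exactly 1.0.
 */
static double one_ulp_error_after(int steps)
{
    return amplify(1.0 + DBL_EPSILON, steps) - amplify(1.0, steps);
}
```

With IEEE double precision, the initial error of DBL_EPSILON (2^-52) reaches 1.0 after 52 steps. Real computations are rarely this pathological, but the sketch shows why a last-bit discrepancy cannot be assumed to stay in the last bit.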
Jim