Message-Id: <1264016052.5122.40.camel@localhost.localdomain>
Date: Wed, 20 Jan 2010 11:34:12 -0800
From: Jim Keniston <jkenisto@...ibm.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: Avi Kivity <avi@...hat.com>, Pekka Enberg <penberg@...helsinki.fi>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>, ananth@...ibm.com,
Ingo Molnar <mingo@...e.hu>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
utrace-devel <utrace-devel@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
Maneesh Soni <maneesh@...ibm.com>,
Mark Wielaard <mjw@...hat.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Wed, 2010-01-20 at 19:31 +0100, Andi Kleen wrote:
> Jim Keniston <jkenisto@...ibm.com> writes:
> >
> > I don't know of any such plans, but I'd be interested to read more of
> > your thoughts here. As I understand it, you've suggested replacing the
> > probed instruction with a jump into an instrumentation vma (the XOL
> > area, or something similar). Masami has demonstrated -- through his
> > djprobes enhancement to kprobes -- that this can be done for many x86
> > instructions.
>
> The big problem when doing this in user space is that for 64bit
> it has to be within 2GB of the probed code, otherwise you would
> need to rewrite the instruction to not use any rip relative addressing,
> which can be rather complicated (needs registers, but the instruction
> might already use them, so you would need a register allocator/spilling etc.)
I'm probably telling you stuff you already know, but...
Re: jumps longer than 2GB: The following 14-byte sequence seems to work:
	jmpq *(%rip)
	.quad next_insn
where next_insn is the address of the instruction to which we want to
jump. We'd need this for boosting, anyway -- to jump from the XOL area
back to the probed instruction stream.
I think djprobes inserts a 5-byte jump at the probepoint; I don't know
whether a 14-byte jump would introduce new difficulties.
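For illustration, here is a minimal sketch of how that 14-byte sequence could be written into a buffer, along with a check for whether an ordinary 5-byte "jmp rel32" would reach instead. The names emit_abs_jmp() and fits_rel32() are hypothetical and do not come from the patch set; this only shows the byte layout, assuming a little-endian x86-64 target.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define ABS_JMP_LEN 14	/* 6-byte jmpq *0(%rip) + 8-byte .quad */

/*
 * Hypothetical helper (not from the UBP patches): write
 *	jmpq *0(%rip)
 *	.quad target
 * into buf, which must have room for ABS_JMP_LEN bytes.
 */
static size_t emit_abs_jmp(uint8_t *buf, uint64_t target)
{
	/* ff 25 00 00 00 00: jump through the 8 bytes that follow */
	static const uint8_t jmp_rip0[6] = { 0xff, 0x25, 0x00, 0x00, 0x00, 0x00 };
	size_t i;

	memcpy(buf, jmp_rip0, sizeof(jmp_rip0));
	/* .quad target, stored least-significant byte first */
	for (i = 0; i < 8; i++)
		buf[6 + i] = (uint8_t)(target >> (8 * i));
	return ABS_JMP_LEN;
}

/*
 * Hypothetical helper: return 1 if a 5-byte "jmp rel32" placed at
 * 'from' can reach 'to'. The displacement is measured from the end
 * of the jump instruction.
 */
static int fits_rel32(uint64_t from, uint64_t to)
{
	int64_t disp = (int64_t)(to - (from + 5));

	return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

A caller could try fits_rel32() first and fall back to emit_abs_jmp() only when the target is out of range, at the cost of overwriting 14 bytes rather than 5 at the probepoint.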
Re: rewriting instructions that use rip-relative addressing. We do that
now. See handle_riprel_insn() in patch #2. (As far as we can tell, it
works, but we'd appreciate your review of it.)
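As background on why the rewrite is needed: a rip-relative operand is addressed relative to the end of the instruction, so a copy of the instruction executed from a different address (such as an XOL slot) would reference the wrong location unless it is adjusted. The sketch below is not handle_riprel_insn(); it only shows, with hypothetical helper names, how a rip-relative ModRM byte can be recognized and how the absolute address the original operand refers to is computed.

```c
#include <stdint.h>

/*
 * In 64-bit mode, a ModRM byte with mod == 00 and r/m == 101 selects
 * disp32(%rip) addressing. The middle three bits (reg) are ignored here.
 */
static int modrm_is_riprel(uint8_t modrm)
{
	return (modrm & 0xc7) == 0x05;
}

/*
 * Absolute address named by a rip-relative operand. While the
 * instruction executes, %rip points at the next instruction, so the
 * base is insn_addr + insn_len.
 */
static uint64_t riprel_target(uint64_t insn_addr, unsigned int insn_len,
			      int32_t disp32)
{
	return insn_addr + insn_len + (int64_t)disp32;
}
```

Once the absolute target is known, one option is to rewrite the copied instruction to reach that address some other way; how the patch actually does this is best read from handle_riprel_insn() itself.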
>
> And that 2GB can be anywhere in the address space for shared
> libraries, which might well be already used. A lot of programs
> need large VM areas without holes.
>
> Also I personally would be uncomfortable to let the instruction
> decoder be used by unprivileged code. Who knows how
> many buffer overflows it has?
The instruction decoder is used only during instruction analysis, while
registering the probe -- i.e., in kernel space.
>
> In general the trend has been also to make traps faster in the CPU, make
> sure you're not optimizing for some old CPU here.
I won't argue with that. What Avi seems to be proposing buys us a
speedup, but at the cost of increased complexity -- among other things,
splitting the instrumentation code between user space (in the "XOL" area
-- which would then be used for much more than XOL instruction slots)
and kernel space. The splitting would presumably be handled by
higher-level code -- SystemTap, perf, or whatever. It's a neat idea,
but it seems like a v2 kind of feature.
>
> -Andi
Jim