Message-ID: <20100127110555.GB1842@in.ibm.com>
Date: Wed, 27 Jan 2010 16:35:55 +0530
From: Ananth N Mavinakayanahalli <ananth@...ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Kyle Moffett <kyle@...fetthome.net>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Frédéric Weisbecker <fweisbec@...il.com>,
Oleg Nesterov <oleg@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Tom Tromey <tromey@...hat.com>,
"Frank Ch. Eigler" <fche@...hat.com>, linux-next@...r.kernel.org,
"H. Peter Anvin" <hpa@...or.com>, utrace-devel@...hat.com,
Thomas Gleixner <tglx@...utronix.de>, avi@...hat.com
Subject: Re: linux-next: add utrace tree
On Wed, Jan 27, 2010 at 11:55:16AM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-27 at 02:43 -0800, Linus Torvalds wrote:
> >
> > On Wed, 27 Jan 2010, Peter Zijlstra wrote:
> > >
> > > Right, so you're going to love uprobes, which does exactly that. The
> > > current proposal is overwriting the target instruction with an INT3 and
> > > injecting an extra vma into the target process's address space
> > > containing the original instruction(s) and possible jumps back to the
> > > old code stream.
> >
> > Just out of interest, how does it handle the threading issue?
> >
> > Last I saw, at least some CPU people were _very_ nervous about overwriting
> > instructions if another CPU might be just about to execute them.
> >
> > Even the "overwrite only the first byte with 'int3'" made them go "umm, I
> > need to talk to some core CPU people to see if that's ok". They mumble
> > about possible CPU errata, I$ coherency, instruction retry etc.
> >
> > I realize kprobes does this very thing, but kprobes is esoteric stuff and
> > doesn't have much choice. In user space, you _could_ do the modification
> > on a different physical page and then just switch the page table entry
> > instead, and not get into the whole D$/I$ coherency thing at all.
>
> Right, so there's two aspects:
>
> 1) concurrency when inserting the probe
> 2) concurrency when hitting the probe
>
> 1) used to be dealt with by using utrace to stop all threads in the
> process and then writing the instruction. I suggested to CoW the page,
> modify the instruction, set the pagetable and flush tlbs at full speed
> -- the very thing you suggest here.
>
> 2) so traditionally (and the intel arch manual describes this) is to
> replace the instruction, single step it, and write the probe back. This
> is racy for multi-threading. The current uprobes stuff solves this by
> doing single-step-out-of-line (XOL).
>
> XOL injects a new vma into the target process and puts the old
> instruction there, then it single steps on the new location, leaving the
> original site with INT3.
>
> This doesn't work for things like RIP relative instructions, so uprobes
> considers them un-probable.
Probing RIP-relative instructions works just fine; there are fixups that
take care of it.
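
To illustrate why a copied RIP-relative instruction needs a fixup at all,
here is a minimal sketch of one possible approach: re-biasing the 32-bit
displacement so the copy in the XOL slot still refers to the same target.
This is not taken from the uprobes code. The function names, the addresses
and the displacement offset are assumptions made up for the example, and
locating the disp32 field is assumed to have been done by an instruction
decoder already.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def rip_relative_target(insn_addr, insn_len, disp32):
    """Effective address of a RIP-relative operand.

    RIP-relative addressing is relative to the address of the *next*
    instruction, hence the insn_len term.
    """
    return (insn_addr + insn_len + disp32) % 2**64


def rebias_disp32(orig_addr, xol_addr, disp32):
    """Displacement that keeps the same target when the instruction is
    executed at xol_addr instead of orig_addr (hypothetical helper)."""
    new_disp = disp32 + (orig_addr - xol_addr)
    if not INT32_MIN <= new_disp <= INT32_MAX:
        # The slot is too far away for a 32-bit displacement; a different
        # fixup strategy would be needed in that case.
        raise OverflowError("re-biased displacement does not fit in 32 bits")
    return new_disp


def patch_disp32(insn, disp_offset, new_disp):
    """Return a copy of the instruction bytes with the little-endian
    disp32 field at disp_offset replaced."""
    return (insn[:disp_offset]
            + struct.pack("<i", new_disp)
            + insn[disp_offset + 4:])


# Example: mov rax, [rip+0x10], encoded as 48 8b 05 10 00 00 00.
insn = bytes.fromhex("488b0510000000")
orig_addr, xol_addr = 0x400000, 0x401000   # made-up addresses
old_disp = struct.unpack_from("<i", insn, 3)[0]
new_disp = rebias_disp32(orig_addr, xol_addr, old_disp)
xol_insn = patch_disp32(insn, 3, new_disp)
```

The overflow check marks the limit of this approach: when the slot lies
more than 2 GB from the probed address, the displacement cannot be
re-biased, and a fixup would have to work differently, for instance by
addressing through a scratch register.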
> Also, I myself really object to inserting a vma in a running process,
> it's like a landlord, sure he has the key but he won't come in and poke
> through your things.
>
> The alternative is to place the instruction in TLS or stack space, since
> each thread can only have a single trap at a time, you only need space
> for 1 instruction (plus a possible jump out to the original site). There
> is the 'problem' of marking the TLS/stack executable when being probed.
>
> Then there is the whole emulation angle, the uprobes people basically
> say it's too much effort to write an x86 emulator.
We don't need to write one. I don't know how easy it is to make the kvm
emulator less kvm-centric (vcpus, kvm_context, etc). Avi?
Ananth