linux-kernel - Re: linux-next: add utrace tree


Message-ID: <20100127110555.GB1842@in.ibm.com>
Date:	Wed, 27 Jan 2010 16:35:55 +0530
From:	Ananth N Mavinakayanahalli <ananth@...ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Kyle Moffett <kyle@...fetthome.net>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Tom Tromey <tromey@...hat.com>,
	"Frank Ch. Eigler" <fche@...hat.com>, linux-next@...r.kernel.org,
	"H. Peter Anvin" <hpa@...or.com>, utrace-devel@...hat.com,
	Thomas Gleixner <tglx@...utronix.de>, avi@...hat.com
Subject: Re: linux-next: add utrace tree

On Wed, Jan 27, 2010 at 11:55:16AM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-27 at 02:43 -0800, Linus Torvalds wrote:
> > 
> > On Wed, 27 Jan 2010, Peter Zijlstra wrote:
> > > 
> > > Right, so you're going to love uprobes, which does exactly that. The
> > > current proposal is overwriting the target instruction with an INT3 and
> > > injecting an extra vma into the target process's address space
> > > containing the original instruction(s) and possible jumps back to the
> > > old code stream.
> > 
> > Just out of interest, how does it handle the threading issue?
> > 
> > Last I saw, at least some CPU people were _very_ nervous about overwriting 
> > instructions if another CPU might be just about to execute them.
> > 
> > Even the "overwrite only the first byte with 'int3'" made them go "umm, I 
> > need to talk to some core CPU people to see if that's ok". They mumble 
> > about possible CPU errata, I$ coherency, instruction retry etc.
> > 
> > I realize kprobes does this very thing, but kprobes is esoteric stuff and 
> > doesn't have much choice. In user space, you _could_ do the modification 
> > on a different physical page and then just switch the page table entry 
> > instead, and not get into the whole D$/I$ coherency thing at all.
> 
> Right, so there's two aspects:
> 
>  1) concurrency when inserting the probe
>  2) concurrency when hitting the probe
> 
> 1) used to be dealt with by using utrace to stop all threads in the
> process and then writing the instruction. I suggested to CoW the page,
> modify the instruction, set the pagetable and flush tlbs at full speed
> -- the very thing you suggest here.
> 
> 2) so traditionally (and the intel arch manual describes this) is to
> replace the instruction, single step it, and write the probe back. This
> is racy for multi-threading. The current uprobes stuff solves this by
> doing single-step-out-of-line (XOL).
> 
> XOL injects a new vma into the target process and puts the old
> instruction there, then it single steps on the new location, leaving the
> original site with INT3.
> 
> This doesn't work for things like RIP relative instructions, so uprobes
> considers them un-probable.

Probing RIP-relative instructions works just fine; there are fixups that
take care of it.
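
To illustrate why a fixup is needed at all: a RIP-relative operand is
addressed relative to the end of the instruction, so a verbatim copy in
the XOL area would reference the wrong location. One simple approach,
sketched below, is to recompute the 32-bit displacement for the new
address. This is only an illustration and not necessarily what the
uprobes fixups do; rip_rel_fixup() and its parameters are hypothetical
names, and instruction decoding is assumed to have happened already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the uprobes patches): recompute the
 * disp32 of a RIP-relative operand when an instruction of length insn_len
 * is copied from orig_ip to xol_ip.
 *
 * Returns false when the original target is not reachable from the copy
 * with a signed 32-bit displacement; a caller would then need some other
 * strategy, which is outside the scope of this sketch.
 */
static bool rip_rel_fixup(uint64_t orig_ip, uint64_t xol_ip,
			  uint8_t insn_len, int32_t old_disp,
			  int32_t *new_disp)
{
	assert(new_disp != NULL);

	/* Absolute address the original instruction refers to. */
	int64_t target = (int64_t)orig_ip + insn_len + old_disp;

	/* The same target, expressed relative to the end of the copy. */
	int64_t disp = target - ((int64_t)xol_ip + insn_len);

	if (disp < INT32_MIN || disp > INT32_MAX)
		return false;

	*new_disp = (int32_t)disp;
	return true;
}
```

The range check is the interesting part: if the XOL area sits more than
2GB away from the probed text, patching the displacement alone cannot
work, and the fixup has to do something more involved.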

> Also, I myself really object to inserting a vma in a running process,
> it's like a land-lord, sure he has the key but he won't come in and poke
> through your things.
> 
> The alternative is to place the instruction in TLS or stack space, since
> each thread can only have a single trap at a time, you only need space
> for 1 instruction (plus a possible jump out to the original site). There
> is the 'problem' of marking the TLS/stack executable when being probed.
> 
> Then there is the whole emulation angle, the uprobes people basically
> say it's too much effort to write an x86 emulator.

We don't need to write one. I don't know how easy it is to make the kvm
emulator less kvm-centric (vcpus, kvm_context, etc). Avi?

Ananth 