Message-ID: <20100120155753.GF5154@csn.ul.ie>
Date: Wed, 20 Jan 2010 15:57:53 +0000
From: Mel Gorman <mel@....ul.ie>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Avi Kivity <avi@...hat.com>, ananth@...ibm.com,
Jim Keniston <jkenisto@...ibm.com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...e.hu>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
utrace-devel <utrace-devel@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
Maneesh Soni <maneesh@...ibm.com>,
Mark Wielaard <mjw@...hat.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote:
> On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
> > On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
> > >
> > >> Well, the alternatives are very unappealing. Emulation and
> > >> single-stepping are going to be very slow compared to a couple of jumps.
> > >>
> > > With CPL2 or RPL on user segments the protection issue seems to be
> > > manageable for running the instructions from kernel space.
> > >
> >
> > CPL2 gives unrestricted access to the kernel address space; and RPL does
> > not affect page level protection. Segment limits don't work on x86-64.
> > But perhaps I missed something - these things are tricky.
>
> So setting RPL to 3 on the user segments allows access to kernel pages
> just fine? How useful.. :/
>
> > It should be possible to translate the instruction into an address space
> > check, followed by the action, but that's still slower due to privilege
> > level switches.
>
> Well, if you manage to do the address validation you don't need the priv
> level switch anymore, right?
>
It also starts becoming very x86-centric though, doesn't it? It might
kick other ports later.
What is there at the moment is storing the copied instructions in a VMA.
The most unpalatable part of that to me is that it's visible to
userspace, probably via /proc/ and I didn't check, but I hope an
munmap() from userspace cannot delete it.
What the VMA has going for it is that it *appears* to be easier to port to
other architectures than the alternatives, certainly easier to handle than
instruction emulation.
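To make the munmap() worry concrete, here is a rough userspace sketch of what
happens to an ordinary mapping. It does not touch the patch set: the anonymous
page is only a stand-in for the XOL area, and the helper names
page_is_mapped() and unmap_stand_in_area() are made up for illustration.
Whether the real XOL VMA is protected against this is exactly the part I
did not check.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Return 1 if the range at addr is mapped, 0 if it is not, -1 on any
 * other error. Relies on msync() failing with ENOMEM for unmapped memory.
 */
int page_is_mapped(void *addr, size_t len)
{
	assert(len > 0);
	if (msync(addr, len, MS_ASYNC) == 0)
		return 1;
	return errno == ENOMEM ? 0 : -1;
}

/*
 * Map one anonymous page as a stand-in for an XOL area (an assumption,
 * not the patch's mechanism), then remove it with a plain munmap().
 * Returns 0 if the page was mapped before and is gone afterwards.
 */
int unmap_stand_in_area(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *area;

	if (page <= 0)
		return -1;

	area = mmap(NULL, (size_t)page, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return -1;

	if (page_is_mapped(area, (size_t)page) != 1) {
		munmap(area, (size_t)page);
		return -1;
	}

	/* Nothing special is needed: any mapping in the mm can be dropped. */
	if (munmap(area, (size_t)page) != 0)
		return -1;

	return page_is_mapped(area, (size_t)page) == 0 ? 0 : -1;
}
```

If unmap_stand_in_area() returns 0, the mapping disappeared without any
objection from the kernel, which is the behaviour a kernel-owned VMA holding
copied instructions would have to guard against.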
> Are the ins encodings sane enough to recognize mem parameters without
> needing to know the actual ins?
>
> How about using a hw-breakpoint to close the gap for the inline single
> step? You could even re-insert the int3 lazily when you need the
> hw-breakpoint again. It would consume one hw-breakpoint register for
> each task/cpu that has probes though..
>
This feels very racy. Along with that, making these sorts of changes
was considered a risky venture on x86 and needed strong verification from
elsewhere (http://lkml.org/lkml/2010/1/12/300). There are probably similar
concerns on other architectures that would make a reliable port difficult.
Right now the approach is with VMAs. The alternatives are
1. reserved XOL page (similar disadvantages to the VMA)
2. emulated instructions
   This is an emulation bug waiting to happen in my opinion and makes
   porting uprobes a significantly more difficult undertaking than
   either the XOL-VMA or XOL-page approach
3. XOL page in kernel space available at a different CPL
   This assumes all target architectures have a usable privilege
   ring which may be the case. However, I would guess that it
   is going to perform worse than the current approach because
   of the change in privilege level. No idea what the cost of
   a privilege level change is, but I doubt it's free
4. Boosted probes (arch-specific, apparently only x86 does this for
   kprobes)
As unpalatable as the VMA is, I am failing to see why it's not a
reasonable starting point with an understanding that 2 or 3 would be
implemented in the future after the other architecture ports are in
place and the reliability of the options as well as the performance can
be measured.
There would appear to be two classes of application that might suffer
from the VMA. The first needs absolutely every single ounce of address
space. The second introspects itself via /proc/self/maps and makes
decisions based on that. The first is unfortunate but should be a limited
number of use cases. The second could be fudged by simply not exporting the
information via /proc.
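For the second class, this is roughly the kind of self-inspection I have in
mind. It is only a sketch: count_mappings() is a hypothetical helper, and the
needle string is whatever the caller wants to look for, since nothing here
says what name (if any) the XOL VMA would show up under.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the mappings listed in a maps-format file (one mapping per line).
 * If needle and matches are both non-NULL, *matches receives the number of
 * buffer reads containing needle (normally one per matching line).
 * Returns the number of lines, or -1 if the file cannot be opened.
 */
int count_mappings(const char *path, const char *needle, int *matches)
{
	char line[8192];
	int total = 0;
	int partial = 0;
	FILE *fp;

	assert(path != NULL);
	if (matches)
		*matches = 0;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	while (fgets(line, sizeof(line), fp)) {
		size_t n = strlen(line);

		if (needle && matches && strstr(line, needle))
			(*matches)++;

		/* Only count a line once its terminating newline is seen. */
		if (n > 0 && line[n - 1] == '\n') {
			total++;
			partial = 0;
		} else {
			partial = 1;
		}
	}
	/* A final line without a trailing newline still counts. */
	if (partial)
		total++;

	fclose(fp);
	return total;
}
```

An application calling count_mappings("/proc/self/maps", NULL, NULL) before
and after a probe is inserted would see the total change by one if the new
VMA is exported, which is the sort of difference such a program might trip
over.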
I'm of the opinion it would be reasonable to let the VMA go ahead, look
at the ports for the other architectures and revisit options 2 and 3 above
to see if the VMA can really be removed without a performance or
reliability penalty.
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab