Message-ID: <20120118080103.GA2889@moon>
Date: Wed, 18 Jan 2012 12:01:03 +0400
From: Cyrill Gorcunov <gorcunov@...il.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: "H. Peter Anvin" <hpa@...or.com>,
Alexey Dobriyan <adobriyan@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Pavel Emelyanov <xemul@...allels.com>,
Andrey Vagin <avagin@...nvz.org>, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Glauber Costa <glommer@...allels.com>,
Andi Kleen <andi@...stfloor.org>, Tejun Heo <tj@...nel.org>,
Matt Helsley <matthltc@...ibm.com>,
Pekka Enberg <penberg@...nel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Vasiliy Kulikov <segoon@...nwall.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Valdis.Kletnieks@...edu
Subject: Re: [RFC] syscalls, x86: Add __NR_kcmp syscall

On Tue, Jan 17, 2012 at 01:35:00PM -0800, Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@...or.com> writes:
>
> > On 01/17/2012 06:44 AM, Cyrill Gorcunov wrote:
> >> On Tue, Jan 17, 2012 at 04:38:14PM +0200, Alexey Dobriyan wrote:
> >>> On 1/17/12, Cyrill Gorcunov <gorcunov@...il.com> wrote:
> >>>> +#define KCMP_EQ 0
> >>>> +#define KCMP_LT 1
> >>>> +#define KCMP_GT 2
> >>>
> >>> LT and GT are meaningless.
> >>>
> >>
> >> I found symbolic names better than open-coded values. But sure,
> >> if this is problem it could be dropped.
> >>
> >> Or you mean that in general anything but 'equal' is useless?
> >>
> >
> > Why on Earth would user space need to know which order in memory certain
> > kernel objects are?
>
> For checkpoint restart and for some other kinds of introspection what is
> needed is a comparison function to see if two processes share the same
> object. The most interesting of these objects from a checkpoint restart case
> are file descriptors, and there can be a lot of file descriptors.
>
> The order in memory does not matter. What does matter is that the
> comparison function return some ordering between objects. The algorithm
> for figuring out of N items which of them are duplicates is O(N^2) if
> the comparison function can only return equal or not equal. The
> algorithm for finding duplications is only O(NlogN) if the comparison
> function will return an ordering among the objects.
>
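
A minimal user-space sketch of the complexity argument quoted above. The
names here (struct obj_ref, obj_cmp, count_duplicates) are hypothetical, and
the 'id' field merely stands in for whatever hidden key an ordering-capable
comparison would be based on; it is not the syscall interface from the patch.
With a three-way comparison the objects can be sorted and duplicates end up
adjacent, so one linear pass finds them:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference to a kernel object. 'id' models the
 * hidden key that an ordering-capable comparison would compare by. */
struct obj_ref {
	int fd;
	unsigned long id;
};

/* Three-way comparison: <0, 0 or >0. Assumption: this plays the role of a
 * kcmp-like call that returns an ordering rather than only eq/ne. */
static int obj_cmp(const void *a, const void *b)
{
	const struct obj_ref *x = a, *y = b;

	return (x->id > y->id) - (x->id < y->id);
}

/* Sort by the ordering, then count entries equal to their predecessor.
 * The sort typically needs O(N log N) comparisons; the scan needs N - 1. */
static size_t count_duplicates(struct obj_ref *refs, size_t n)
{
	size_t i, dups = 0;

	qsort(refs, n, sizeof(*refs), obj_cmp);
	for (i = 1; i < n; i++)
		if (obj_cmp(&refs[i - 1], &refs[i]) == 0)
			dups++;
	return dups;
}
```

With only an equal/not-equal answer there is nothing to sort by, so in the
worst case every pair has to be compared.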
Yes, thanks Eric, I left this out of the patch description, my bad. And
yes, performance will degrade with a plain eq/ne approach. But as Pavel
stated in another email:
| We can compare the e.g. files' target inodes (ino + dev) and positions and
| comparing each-to-each only for those having these pairs equal. Looking at
| the existing large containers with tens thousands of fd-s we have this
| gives us maximum 6 files to compare, and performing 15 syscalls for this suits
| us for now.
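
A rough sketch of the pre-filtering Pavel describes, under stated
assumptions: struct fd_key, key_cmp and pairwise_calls_needed are
hypothetical names, and the key is taken to be (dev, ino, pos) as gathered
from user space. Files are grouped by key, and only files inside one group
are compared each-to-each, which costs at most k*(k-1)/2 eq/ne calls for a
group of k files (15 for k = 6):

```c
#include <stdlib.h>

/* Hypothetical per-fd key collected in user space before any syscall. */
struct fd_key {
	unsigned long dev;
	unsigned long ino;
	long long pos;
	int fd;
};

/* Order by (dev, ino, pos) so that candidates for sharing become adjacent. */
static int key_cmp(const void *a, const void *b)
{
	const struct fd_key *x = a, *y = b;

	if (x->dev != y->dev)
		return (x->dev > y->dev) - (x->dev < y->dev);
	if (x->ino != y->ino)
		return (x->ino > y->ino) - (x->ino < y->ino);
	return (x->pos > y->pos) - (x->pos < y->pos);
}

/* Upper bound on eq/ne calls: each run of k equal keys needs at most
 * k * (k - 1) / 2 pairwise comparisons. */
static size_t pairwise_calls_needed(struct fd_key *keys, size_t n)
{
	size_t i, run = 1, calls = 0;

	if (n == 0)
		return 0;
	qsort(keys, n, sizeof(*keys), key_cmp);
	for (i = 1; i <= n; i++) {
		if (i < n && key_cmp(&keys[i - 1], &keys[i]) == 0) {
			run++;
			continue;
		}
		calls += run * (run - 1) / 2;	/* close the current run */
		run = 1;
	}
	return calls;
}
```
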
> > Keep in mind that this is *exactly* the kind of information which makes
> > rootkits easier.
>
> I would be very surprised if basic in memory ordering information was
> not already available from simple creation ordering.
>
I think Peter means the scenario where there is, say, a bug in the slab/slub
code that triggers on some Nth allocation, and an attacker has somehow
learned the memory address of at least one struct file. Using such a syscall,
the attacker could then inspect a series of fds (and the associated struct
file objects) and guess the addresses of the rest of them. In most cases this
won't help (if the system is under more or less high load and opens/closes
files fast enough, because struct file comes from kmem caches), but on a
lightly loaded machine this might do the trick and narrow down the addresses
(if, say, there are only 10 fds allocated from the cache in a row and you
somehow know the address of one associated struct file).

In short -- I don't know whether it's really a serious issue or not
(since from my POV it would take at least a couple of bugs in a row
before the attacker could use this information). OTOH, shit
happens exactly in 'impossible' scenarios ;)

> If using the in memory ordering is a problem in practice there are a lot
> of other possible ways to order the kernel objects. Allocating sequence
> numbers for the kernel objects, passing the pointers through a
> cryptographically secure hash before comparing them, etc.
>
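
For illustration only, a sketch of one cheap way to derive an ordering that
does not mirror raw addresses. This is a swapped-in technique (XOR with a
secret cookie, then multiply by an odd cookie), not the cryptographically
secure hash mentioned above and not necessarily what was tried; obfuscate,
obfuscated_cmp and the cookie parameters are hypothetical names. Multiplying
by an odd number modulo 2^64 is a bijection, so equal inputs stay equal and
distinct inputs stay distinct, while the relative order is scrambled:

```c
#include <stdint.h>

/* Map a pointer-sized value to an obfuscated key. Forcing the multiplier
 * odd keeps the mapping one-to-one modulo 2^64. */
static uint64_t obfuscate(uint64_t ptr, uint64_t xor_cookie, uint64_t mul_cookie)
{
	return (ptr ^ xor_cookie) * (mul_cookie | 1);
}

/* Three-way comparison on the obfuscated keys instead of the raw values. */
static int obfuscated_cmp(uint64_t a, uint64_t b,
			  uint64_t xor_cookie, uint64_t mul_cookie)
{
	uint64_t x = obfuscate(a, xor_cookie, mul_cookie);
	uint64_t y = obfuscate(b, xor_cookie, mul_cookie);

	return (x > y) - (x < y);
}
```

The cookies are assumed to be random and kept secret by whoever does the
comparison; how strong this is against a determined attacker is a separate
question.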
We've tried this already ;)

> It does look like Cyrill's patch description lacked the important bit of
> information about the algorithm complexity requiring an ordering among
> kernel objects. Cyrill you probably want to describe more prominently
> what is happening now and why in your patch description rather than give
> the history of different approaches.
>
Yeah, I'll write a detailed changelog, gimme some time. Thanks Eric!

Btw, extending this syscall to an lt/gt variant will be easy, so this is
not a problem I think. At the moment we guarantee to return 0/1 on success
and < 0 on error, so if we start returning 2/3 for the sake of ordering,
applications that were using only the 0/1 values won't crash (unless they
are badly written).
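
A sketch of what a forward-compatible caller might look like. The names
(enum same_obj, classify_kcmp_result) are hypothetical, and the mapping is
an assumption: 0 is taken to mean "same object" (as with KCMP_EQ in the
quoted defines), any other non-negative value "different", and a negative
value an error.

```c
/* Hypothetical classification of a kcmp-like return value. */
enum same_obj {
	OBJ_SAME,
	OBJ_DIFFERENT,
	OBJ_ERROR,
};

static enum same_obj classify_kcmp_result(long ret)
{
	if (ret < 0)
		return OBJ_ERROR;	/* < 0 on error */
	if (ret == 0)
		return OBJ_SAME;	/* assumption: 0 means equal */
	/* 1 today; would also cover 2/3 if ordering values are added later. */
	return OBJ_DIFFERENT;
}
```

A caller written this way keeps working if ordering results are added,
whereas one that checks for exactly 1 would not.
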
Cyrill