Date:	Mon, 13 Aug 2012 03:37:50 -0700
From:	Michel Lespinasse <walken@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	riel@...hat.com, vrajesh@...ch.edu, daniel.santos@...ox.com,
	aarcange@...hat.com, dwmw2@...radead.org,
	akpm@...ux-foundation.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org
Subject: Re: [PATCH 0/5] rbtree based interval tree as a prio_tree replacement

On Mon, Aug 13, 2012 at 1:20 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, 2012-08-07 at 00:25 -0700, Michel Lespinasse wrote:
>> a faster worst-case complexity of O(k+log N) for stabbing queries in a
>> well-balanced prio tree, vs O(k*log N) for interval trees (where k=number
>> of matches, N=number of intervals). Now this sounds great, but in practice
>> prio trees don't realize this theoretical benefit. First, the additional
>> constraint makes them harder to update, so that the kernel implementation
>> has to simplify things by balancing them like a radix tree, which is not
>> always ideal.
>
> Not something worth spending a great deal of time on, but do you have
> any idea what the radix-like balancing does to the worst case stabbing
> complexity?

Well, it's not terribly bad as the height of the radix tree is still
bounded by (IIRC) log(ULONG_MAX) + log(max pgoff_t in the file). So
the complexity ends up being O(k + some constant) where the constant
is in practice always larger than log(N).
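
To put rough numbers on that (a back-of-the-envelope figure assuming
64-bit unsigned long and pgoff_t): the height bound is about
log2(2^64) + log2(2^64) = 128 levels, while log2(N) is only ~30 even
for a billion intervals in the tree, so that constant really does
dominate log(N) for any realistic N.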

One can still construct cases where a prio tree stabbing query would
read fewer nodes (and so touch fewer cache lines) than an augmented
rbtree. I think the best way to explain this is to look at what the
augmented rbtree does: for a stabbing query returning k matches, it
visits all nodes between the root and the k matching nodes (plus one
extra path at the end to conclude that there are no further matches).
So the worst case there is when the matching nodes are all close to
leaf level in the tree, and the paths leading to them diverge close
to the root. It's easy to construct such a case by making all N
intervals in the tree start to the left of the stabbing query, and
having only an arbitrary k among them end to the right of the query.
The prio tree achieves a better worst case by placing these k nodes
near the top of the tree (due to the heap constraint). But this worst
case isn't very likely to begin with, and in the typical case the k
paths leading to the matching nodes in an augmented rbtree don't
diverge so close to the root, so the prio tree typically doesn't gain
an advantage there.
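
To make the shape of that walk concrete, here is a minimal sketch of a
textbook stabbing query on an augmented search tree (not the actual
kernel code; the itnode/max_last naming is made up for illustration,
and the real implementation is iterative and rbtree-based):

struct itnode {
	unsigned long start, last;	/* closed interval [start, last] */
	unsigned long max_last;		/* max of 'last' over this subtree */
	struct itnode *left, *right;	/* children, keyed on 'start' */
};

/* Call report() on every interval containing 'point'. */
static void stab(struct itnode *node, unsigned long point,
		 void (*report)(struct itnode *))
{
	if (!node || node->max_last < point)
		return;		/* nothing below here can contain 'point' */
	stab(node->left, point, report);
	if (node->start <= point && point <= node->last)
		report(node);
	/* intervals in the right subtree all start at or after node->start */
	if (node->start <= point)
		stab(node->right, point, report);
}

In the worst case described above (all starts to the left of the query
point), the max_last pruning is the only thing cutting the walk short,
so the visited nodes are essentially just the root-to-match paths plus
the pruned children checked along the way; the prio tree's heap
ordering shortens those paths by pulling the matches up to the top.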

To sum it up, I would say that both algorithms are really pretty good:
the prio tree gets a slightly better theoretical worst case, and the
augmented rbtree gets (IMO) a practical advantage due to lower constant
factors and a still very acceptable worst case.

> Anyway, I like the thing, that prio-tree code always made my head hurt.

Black magic I tell you ;-)

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
