linux-kernel - Re: [PATCH] add b+tree library

Message-ID: <20090111082010.GA30090@logfs.org>
Date:	Sun, 11 Jan 2009 09:20:11 +0100
From:	Jörn Engel <joern@...fs.org>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Theodore Tso <tytso@....edu>,
	Johannes Berg <johannes@...solutions.net>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] add b+tree library

On Sat, 10 January 2009 23:26:56 +0100, Andi Kleen wrote:
> 
> > Finally, are b+tree so much better than rbtrees that it would be
> > worthwhile to just *replace* rbtrees with b+trees?  Or is the problem
> 
> There've been a couple of proposals like this over the years, also
> with other data structures like judy trees (which seem to bring
> the cache line optimization Joern talks about to even more extreme).
> Splay trees also seem to be a popular suggestion, although they've
> been considered dubious by other people (including me). Another
> alternative would be skip lists.

The number of people that truly understand what Judy trees do may be
single-digit.  Main disadvantage I see is that Judy trees heavily rely
on repacking nodes over and over.  Part of Judy is a memory manager with
essentially slab caches for nodes with 2, 4, 6, 8, 12, 16, 24, 32, 48,
64, 96, 128, 192, 256, 384 and 512 words.
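
To make the repacking cost concrete, here is a minimal sketch of
size-class rounding over the node sizes listed above. The table is
copied from that list; the helper name size_class_for() is hypothetical
and is not taken from the Judy sources.

```c
#include <stddef.h>

/* Node sizes in words, copied from the list in the message. */
static const size_t size_classes[] = {
	2, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 192, 256, 384, 512
};

#define NR_CLASSES (sizeof(size_classes) / sizeof(size_classes[0]))

/*
 * Hypothetical helper: return the smallest size class that can hold
 * `words` words, or 0 if the request exceeds the largest class.
 */
static size_t size_class_for(size_t words)
{
	size_t i;

	for (i = 0; i < NR_CLASSES; i++)
		if (size_classes[i] >= words)
			return size_classes[i];
	return 0;
}
```

Under this model a node that grows from 8 to 9 words no longer fits its
class and has to be copied into a 12-word node, which is the kind of
repacking the paragraph above refers to.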

Splay trees are still binary trees, so the fan-out argument is identical
to that against rbtrees.  If we have to pull in a cacheline, we might as
well use all of it.
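
The fan-out argument can be put into numbers with a rough sketch. It
assumes a perfectly balanced tree and one cacheline fetch per level;
the function name tree_levels() and the example figures (64-byte
cacheline, 8-byte keys, hence a fan-out of about 8) are illustrative
assumptions, not taken from the patch.

```c
#include <stdint.h>

/*
 * Number of levels a balanced tree needs to index `entries` keys when
 * every node has `fanout` children.  Integer-only: multiply the
 * reachable capacity by `fanout` per level until it covers `entries`.
 */
static unsigned int tree_levels(uint64_t entries, uint64_t fanout)
{
	unsigned int levels = 0;
	uint64_t capacity = 1;

	if (fanout < 2)
		return 0;	/* not a tree */

	while (capacity < entries) {
		/* saturate instead of overflowing */
		if (capacity > UINT64_MAX / fanout)
			capacity = UINT64_MAX;
		else
			capacity *= fanout;
		levels++;
	}
	return levels;
}
```

For a million entries this gives 20 levels at fan-out 2 and 7 levels at
fan-out 8, so under these assumptions the wider node touches roughly a
third as many cachelines per lookup.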

Skip lists are just a Bad Idea(tm).  In big-O terms they behave like
binary trees, waste cachelines left and right, use more memory, depend
on a sufficiently good random() function,...  I guess you never looked
at them closely, because anyone who does tries to forget them as fast as
possible.
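
The dependence on random() shows up in how a skip list picks the height
of each new node. The following is a minimal sketch of the usual
coin-flip scheme, not code from any kernel patch; MAX_LEVEL and the toy
xorshift32 generator standing in for random() are assumptions made for
illustration.

```c
#include <stdint.h>

#define MAX_LEVEL 16	/* assumed cap, not from the message */

/* Toy xorshift32 generator standing in for random(); state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Pick a node height: start at 1 and add a level for every "heads"
 * coin flip, so taller nodes become exponentially rarer.
 */
static int random_level(uint32_t *state)
{
	int level = 1;

	while (level < MAX_LEVEL && (xorshift32(state) & 1))
		level++;
	return level;
}
```

If the generator is biased or correlated, the heights stop following
the expected distribution and the logarithmic search bound no longer
holds, which is one reading of the objection above.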

> Also I don't remember there ever being a big discussion about
> rbtrees vs other trees -- it was just that Andrea liked
> them and added a convenient library at some point, and other
> people found it convenient too and started using it.
> 
> But it's unclear how much all that would really help.
> 
> I always thought it might be advantageous to look at a little
> more abstract interface for the existing rbtree users (shouldn't
> be that hard, the interface is already not too bad) and then just
> let some students implement a couple of backend data structures
> for that interface and then run some benchmarks.
> 
> I don't think it's a good idea to add a b*tree library
> and use it only from a few users though. After all it's
> a lot of code and that should have a clear benefit.

Quoting Dave Chinner:
| I think a btree library would be useful - there are places where
| people are using rb-trees instead of btree simply because it's
| easier to use the rbtree than it is to implement a btree library.
| I can think of several places I could use such a library for
| in-memory extent representation....

Jörn

-- 
Joern's library part 15:
http://www.knosof.co.uk/cbook/accu06a.pdf