linux-kernel - Re: [Bcache v13 12/16] bcache: Bset code (lookups within a btree node)

Message-ID: <20120524181116.GF27983@google.com>
Date:	Thu, 24 May 2012 11:11:16 -0700
From:	Tejun Heo <tj@...nel.org>
To:	Kent Overstreet <koverstreet@...gle.com>
Cc:	linux-bcache@...r.kernel.org, linux-kernel@...r.kernel.org,
	dm-devel@...hat.com, agk@...hat.com
Subject: Re: [Bcache v13 12/16] bcache: Bset code (lookups within a btree
 node)

Hello, Kent.

On Wed, May 23, 2012 at 01:49:14PM -0700, Tejun Heo wrote:
> On Wed, May 23, 2012 at 10:55:44AM -0700, Tejun Heo wrote:
> > How much benefit are we gaining by doing this float thing?  Any chance
> > we can do something simpler?  As long as it's properly encapsulated,
> > it shouldn't be too bad but I keep having the suspicion that a lot of
> > complexity is added for unnecessary level of optimization.
> 
> e.g. how much performance gain does it provide compared to just using
> u64 in binary search tree combined with last search result hint for
> sequential accesses?

I thought a bit more about it last night and have one more question.
As I understand it, the use of floating point numbers in the binary
search tree is to use less memory and thus lower cacheline overhead on
lookups, and the tree is built between the min and max values of the
bset so that a cluster of large keys doesn't get lower resolution from
a high exponent.

Assuming a resolution of 512 bytes (2^9), 32 bits can cover 2^41 bytes (2TiB).  If
having a compact search tree is so important (I kinda have a hard time
accepting that too tho), why not just use a binary search tree of either
u32 or u64 depending on the range?  In most cases u32 will be used and
it's not like float can cover large ranges well anyway.  It can
degenerate into essentially linear search depending on key
distribution (e.g. a partition table read at sector 0 followed by a bunch
of accesses to a high partition) - a u64 tree will show much better bounded
behavior in cases where the key range becomes large.  Wouldn't that be
*far* simpler and easier to understand and maintain without much
downside?
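
To make the suggestion concrete, here is a rough userspace sketch of
what I mean.  It is not bcache code: struct key_index, key_index_init()
and key_index_lookup() are names made up for illustration, and it uses a
plain sorted array with binary search rather than any particular tree
layout.  Keys are stored as u32 offsets from the minimum when the whole
range fits in 32 bits, and as full u64 otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical search index over the sorted keys of one bset. */
struct key_index {
	uint64_t  min;     /* smallest key, subtracted before narrowing */
	size_t    nr;      /* number of keys */
	int       is_u32;  /* nonzero if every (key - min) fits in 32 bits */
	uint32_t *k32;     /* offsets from min, used when is_u32 is set */
	uint64_t *k64;     /* full keys, used otherwise */
};

/* Build the index from a sorted, non-empty array; returns 0 on success. */
int key_index_init(struct key_index *idx, const uint64_t *keys, size_t nr)
{
	size_t i;

	assert(nr > 0);
	idx->min = keys[0];
	idx->nr = nr;
	idx->is_u32 = (keys[nr - 1] - keys[0]) <= UINT32_MAX;
	idx->k32 = NULL;
	idx->k64 = NULL;

	if (idx->is_u32) {
		idx->k32 = malloc(nr * sizeof(*idx->k32));
		if (!idx->k32)
			return -1;
		for (i = 0; i < nr; i++)
			idx->k32[i] = (uint32_t)(keys[i] - idx->min);
	} else {
		idx->k64 = malloc(nr * sizeof(*idx->k64));
		if (!idx->k64)
			return -1;
		for (i = 0; i < nr; i++)
			idx->k64[i] = keys[i];
	}
	return 0;
}

/* Return the index of the first key >= search, or nr if there is none. */
size_t key_index_lookup(const struct key_index *idx, uint64_t search)
{
	size_t lo = 0, hi = idx->nr;

	if (idx->is_u32) {
		uint32_t s;

		if (search <= idx->min)
			return 0;
		if (search - idx->min > UINT32_MAX)
			return idx->nr;  /* beyond every stored offset */
		s = (uint32_t)(search - idx->min);
		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;

			if (idx->k32[mid] < s)
				lo = mid + 1;
			else
				hi = mid;
		}
	} else {
		while (lo < hi) {
			size_t mid = lo + (hi - lo) / 2;

			if (idx->k64[mid] < search)
				lo = mid + 1;
			else
				hi = mid;
		}
	}
	return lo;
}
```

Either way the lookup is a plain O(log n) binary search regardless of how
the keys are distributed; the narrow case just halves the memory touched
per key.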

Thanks.

-- 
tejun