lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 13 Mar 2012 16:22:52 -0400
From:	Phillip Susi <phillsusi@...il.com>
To:	Ted Ts'o <tytso@....edu>, Andreas Dilger <adilger@...mcloud.com>,
	Lukas Czerner <lczerner@...hat.com>,
	Jacek Luczak <difrost.kernel@...il.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-btrfs@...r.kernel.org" <linux-btrfs@...r.kernel.org>
Subject: Re: getdents - ext4 vs btrfs performance

On 3/13/2012 3:53 PM, Ted Ts'o wrote:
> Because that would be a format change.

I think a format change would be preferable to runtime sorting.

> What we have today is not a hash table; it's a hashed tree, where we
> use a fixed-length key for the tree based on the hash of the file
> name.  Currently the leaf nodes of the tree are the directory blocks
> themselves; that is, the lowest level of the index blocks tells you to
> look at directory block N, where that directory contains the directory
> indexes for those file names which are in a particular range (say,
> between 0x2325777A and 0x2325801).

So the index nodes contain the hash ranges for the leaf block, but the 
leaf block only contains the regular directory entries, not a hash for 
each name?  That would mean that adding or removing names would require 
moving around the regular directory entries wouldn't it?

> If we aren't going to change the ordering of the directory directory,
> that means we would need to change things so the leaf nodes contain
> the actual directory file names themselves, so that we know whether or
> not we've hit the correct entry or not before we go to read in a
> specific directory block (otherwise, you'd have problems dealing with
> hash collisions).  But in that case, instead of storing the pointer to
> the directory entry, since the bulk of the size of a directory entry
> is the filename itself, you might as well store the inode number in
> the tree itself, and be done with it.

I would think that hash collisions are rare enough that reading a 
directory block you end up not needing once in a blue moon would be 
chalked up under "who cares".  So just stick with hash, offset pairs to 
map the hash to the normal directory entry.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists