Message-ID: <20070412122116.GD28148@thunk.org>
Date:	Thu, 12 Apr 2007 08:21:16 -0400
From:	Theodore Tso <tytso@....edu>
To:	Neil Brown <neilb@...e.de>
Cc:	Jörn Engel <joern@...ybastard.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Christoph Hellwig <hch@...radead.org>,
	Ulrich Drepper <drepper@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: If not readdir() then what?

On Thu, Apr 12, 2007 at 03:57:41PM +1000, Neil Brown wrote:
> But my perspective is that a solution in nfsd is at best a work-around.
> Caching the whole 'struct file' when there is just a small bit that we
> might want seems like a heavy hammer.  The filesystem is in the best
> place to know what needs to be cached, and it should be the one doing
> the caching.

Sure, but ext3 needs to keep this information in the file handle,
because it is dynamic, per-directory-stream state.  The fundamental
problem is the broken nature of the 64-bit cookie; it simply isn't big
enough.  So what's being disputed is who gets to pay the cost of that
particular design mistake, in POSIX and in NFS.

In the POSIX case, right now only applications that use
telldir/seekdir pay the cost, which is that they might see some
repeated directory entries in the case of hash collisions.

Unfortunately, in the NFS case if there are hash collisions, under the
wrong circumstances the NFS client could loop forever trying to
readdir() a directory stream.
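
To make the two failure modes concrete, here is a toy model in Python.
It is only a sketch, not ext3 or nfsd code: the names read_from and
list_directory, the reply limit max_entries, and the rule "resume at
the first entry whose hash is >= the cookie" are all assumptions made
for illustration.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (hash, name), kept sorted by hash

# Hypothetical directory in which "b", "c" and "d" collide on hash 5.
DIRECTORY: List[Entry] = [(1, "a"), (5, "b"), (5, "c"), (5, "d"), (9, "e")]


def read_from(directory: List[Entry], cookie: int,
              max_entries: int) -> Tuple[List[str], Optional[int]]:
    """Return up to max_entries names with hash >= cookie.

    The second value is the cookie to resume from (the hash of the
    first entry that did not fit), or None at end of directory.
    """
    remaining = [e for e in directory if e[0] >= cookie]
    batch, rest = remaining[:max_entries], remaining[max_entries:]
    next_cookie = rest[0][0] if rest else None
    return [name for _, name in batch], next_cookie


def list_directory(directory: List[Entry], max_entries: int,
                   max_calls: int = 20) -> Tuple[List[str], bool]:
    """Client loop: keep reading until EOF or until max_calls is hit.

    Returns the names seen and whether the listing terminated.
    """
    names: List[str] = []
    cookie: Optional[int] = 0
    for _ in range(max_calls):
        batch, cookie = read_from(directory, cookie, max_entries)
        names.extend(batch)
        if cookie is None:
            return names, True
    return names, False  # the cookie stopped making progress


if __name__ == "__main__":
    for limit in (4, 3, 2):
        print(limit, list_directory(DIRECTORY, limit))
```

In this model a reply limit of 4 lists every name once.  A limit of 3
terminates but returns "b" and "c" twice, which is the repeated-entry
case.  A limit of 2 never gets past hash 5, because the resume cookie
always points back at the start of the collision chain, so the client
loops until the call cap stops it.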

> This is a simple consequence of the design decision to use hashes as
> the search key.  They aren't dense and they will collide.  So the
> solution will be a bit fuzzy around the edges.  And maybe that is an
> acceptable tradeoff.  But the filesystem should take full
> responsibility for it, whether in performance or correctness :-)

Well, we could also say that it is NFS's fault that they used a
limited size cookie as a fundamental part of their protocol.... 

> But there are alternatives.  e.g. internal chaining.
> Insist on a unique 64bit hash for every file.  If the hash is in use,
> increment and try again.  On lookup, if the hash leads you to a file
> with the wrong name, increment and try again until you find a hole
> (hash value that is not stored).  When you delete an entry, leave a
> place holder if the next hash is in use.  Conversely if the next hash
> is not in use, delete the entry and delete the previous one if it is a
> place holder.

This solution requires an incompatible file format change.  In
addition, it means trying to garbage collect directory entries that
sit at the beginning or end of a directory block without dragging in
the previous or next directory block.  This is a huge amount of hair,
will screw over performance, and is not compatible with how we do
things today.
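
For reference, the chaining scheme quoted above can be sketched in a
few lines of Python.  This is an in-memory illustration only, with no
directory blocks or on-disk format; the class ChainedDir, the
PLACEHOLDER marker and the hash_fn parameter are hypothetical names.
Inserts probe until they find a hole and do not reuse place holders,
which follows the quoted description literally.

```python
from typing import Callable, Dict, Optional

MASK = (1 << 64) - 1    # hash values live in a 64-bit space
PLACEHOLDER = object()  # marks a deleted entry that must keep its slot


class ChainedDir:
    def __init__(self, hash_fn: Callable[[str], int]) -> None:
        self.hash_fn = hash_fn
        self.slots: Dict[int, object] = {}  # hash value -> name or PLACEHOLDER

    def lookup(self, name: str) -> Optional[int]:
        """Probe from the computed hash until the name or a hole is found."""
        h = self.hash_fn(name) & MASK
        while h in self.slots:
            if self.slots[h] == name:
                return h
            h = (h + 1) & MASK
        return None

    def insert(self, name: str) -> int:
        """Give the name a unique hash: if in use, increment and retry."""
        existing = self.lookup(name)
        if existing is not None:
            return existing
        h = self.hash_fn(name) & MASK
        while h in self.slots:
            h = (h + 1) & MASK
        self.slots[h] = name
        return h

    def delete(self, name: str) -> bool:
        h = self.lookup(name)
        if h is None:
            return False
        if ((h + 1) & MASK) in self.slots:
            # Later entries may have probed past us: keep the slot occupied.
            self.slots[h] = PLACEHOLDER
            return True
        del self.slots[h]
        # The chain now ends here, so place holders just before it are
        # dead weight.  This touches neighbouring hash values.
        prev = (h - 1) & MASK
        while self.slots.get(prev) is PLACEHOLDER:
            del self.slots[prev]
            prev = (prev - 1) & MASK
        return True


if __name__ == "__main__":
    d = ChainedDir(lambda name: 100)  # worst case: every name collides
    for n in ("a", "b", "c"):
        print(n, d.insert(n))
    d.delete("b")
    print(d.lookup("c"))
```

Note that delete() has to inspect and possibly rewrite the neighbouring
hash values.  On disk those neighbours may live in the previous or next
directory block, which is where the garbage collection cost comes from.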

(It also means that you have to store the hash in the directory entry,
which we don't do today, since we can always calculate the hash from
the file name.  But if in some cases the hash is going to be some
small integer plus the calculated hash, you have to bloat the
directory entry by an extra 8 bytes per dirent.)
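
A rough size estimate, as a Python sketch.  It relies on the existing
ext2/ext3 dirent layout: an 8-byte fixed header (inode, rec_len,
name_len, file_type) followed by the name, with the record rounded up
to a multiple of 4 bytes.  The extra 8-byte stored-hash field and the
function name rec_len are hypothetical, and any additional alignment a
real 64-bit field might want is ignored.

```python
HEADER_BYTES = 8  # inode (4) + rec_len (2) + name_len (1) + file_type (1)
HASH_BYTES = 8    # hypothetical stored 64-bit hash


def rec_len(name_len: int, extra: int = 0) -> int:
    """Minimum record length for a name, rounded up to 4 bytes."""
    return (HEADER_BYTES + extra + name_len + 3) & ~3


def growth(name_len: int) -> float:
    """Fractional size increase caused by storing the hash."""
    old = rec_len(name_len)
    new = rec_len(name_len, HASH_BYTES)
    return (new - old) / old


if __name__ == "__main__":
    for n in (5, 12, 32):
        print(n, rec_len(n), rec_len(n, HASH_BYTES), growth(n))
```

Under these assumptions a 12-character name goes from 20 to 28 bytes,
a 40% increase, so fewer entries fit in each directory block.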

Again, compared to a directory fd cache, what you're proposing is a
huge hit to the filesystem.  And at the moment, given that
telldir/seekdir is rarely used by anyone else, NFS is the main bad
actor here, insisting on a small 31/63-bit cookie as a condition of
protocol correctness.
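
As a back-of-the-envelope check on cookie size, the usual birthday
approximation can be computed in Python.  This is a sketch that assumes
the hash is uniformly distributed over the cookie space, which a real
filename hash only approximates; the function name and the directory
size of 100,000 entries are illustrative.

```python
import math


def collision_probability(entries: int, bits: int) -> float:
    """Approximate chance that at least two of `entries` names share a
    hash value in a `bits`-bit space: 1 - exp(-n(n-1)/2 / 2**bits)."""
    pairs = entries * (entries - 1) / 2
    return -math.expm1(-pairs / 2.0 ** bits)


if __name__ == "__main__":
    for bits in (31, 63):
        print(bits, collision_probability(100_000, bits))
```

With 31 bits, a collision in a 100,000-entry directory is close to
certain under this model.  With 63 bits it is very unlikely, but it is
never zero, and a protocol that depends on cookies being unique has no
answer for the case where it happens.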

						- Ted