lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 12 Apr 2007 08:32:05 +1000
From:	Neil Brown <neilb@...e.de>
To:	Theodore Tso <tytso@....edu>
Cc:	Jörn Engel <joern@...ybastard.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Christoph Hellwig <hch@...radead.org>,
	Ulrich Drepper <drepper@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: If not readdir() then what?

On Wednesday April 11, tytso@....edu wrote:
> On Wed, Apr 11, 2007 at 07:15:57AM +1000, Neil Brown wrote:
> > > Neil, is this correct?
> > 
> > It's just NFS.  nfsd does open/getdents/close on every readdir
> > request.
> 
> That's... unfortunate.

I like to think of it as "challenging" :-)

> 
> 
> So the problem with nfsd doing an open/getdents/close on every readdir
> request is two-fold.  First of all, it means that you are incurring
> extra overhead by forcing ext3 to allocate, create, and free the
> red-black tree on every single NFS readdir request.  Secondly, it
> increases the chances of problems with the hash collision logic.

I guess we need a two-fold solution then  :-)

For the first:
  You are storing an internal tree representation of part of the
  directory attached to the 'struct file'.
  Would it be possible to store this attached to the page via the
  ->private pointer?  What would avoid the allocate/create/free
  overhead on every request.

  You suggest caching the open files in nfsd.  While that is probably
  possible (I've thought of it a number of times) is would also be
  quite complex, e.g. requiring some sort of call-back to close all
  those files when the filesystem is unexported.  And it is very easy
  to get caching heuristics wrong.  Leveraging the page-cache which is
  a very mature cache seems to make a lot of sense.

  The cursor would still be in the 'struct file', but if the second
  second fold works, that should be much less critical.

For the second.
  You say that you " would need at least 96 bits in order to make that
  guarantee; 64 bits of hash, plus a 32-bit count value in the hash
  collision chain".  I think 96 is a bit greedy.  Surely 48 bits of 
  hash and 16 bits of collision-chain-position would plenty.  You would
  need 65537 entries before a collision was even possible, and
  billions before it was at all likely. (How big does a set of 48bit
  numbers have to get before the probability that "No subset of 65536
  numbers are all the same" drops below 0.95?)

  This would really require that the collision-chain-index was stable
  across create/delete.  Doing that while you have the tree in the
  page cache is probably easy enough.  Doing it across reboots is
  probably not possible without on-disk changes.

So while my two-fold solution is not perfect(*), I think it is
substantially better than what we have now, and is extremely unlike to
ever show an observable problem.
(*)And perfection can only be achieved we two independent indexes.


This requires 64bit cookies which NFSv2 cannot handle.  I think it is
OK to say "tough".  We should probably make sure that READDIRv2 *always*
fails on a directory which is indexed, and possibly on a filesystem
which allows index directories.  People can always disable indexing if
they want a v2 export.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ