Message-ID: <20130213133133.GB23233@ndevos-laptop.usersys.redhat.com>
Date: Wed, 13 Feb 2013 14:31:33 +0100
From: Niels de Vos <ndevos@...hat.com>
To: "J. Bruce Fields" <bfields@...ldses.org>
Cc: Bernd Schubert <bernd.schubert@...m.fraunhofer.de>,
sandeen@...hat.com, Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, "Theodore Ts'o" <tytso@....edu>,
gluster-devel@...gnu.org
Subject: Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies
On Tue, Feb 12, 2013 at 04:00:54PM -0500, J. Bruce Fields wrote:
> On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote:
> > On 02/12/2013 09:28 PM, J. Bruce Fields wrote:
> > > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)"
> > > and previous patches solved problems with hash collisions in large
> > > directories by using 64- instead of 32- bit directory hashes in some
> > > cases. But it caused problems for users who assume directory offsets
> > > are "small". Two cases we've run across:
> > >
> > > - older NFS clients: 64-bit cookies cause applications on many
> > > older clients to fail.
> > > - gluster: gluster assumed that it could take the top bits of
> > > the offset for its own use.
> > >
> > > In both cases we could argue we're in the right: the nfs protocol
> > > defines cookies to be 64 bits, so clients should be prepared to handle
> > > them (remapping to smaller integers if necessary to placate applications
> > > using older system interfaces). And gluster was incorrect to assume
> > > that the "offset" was really an "offset" as opposed to just an opaque
> > > value.
> > >
> > > But in practice things that worked fine for a long time break on a
> > > kernel upgrade.
> > >
> > > So at a minimum I think we owe people a workaround, and turning off
> > > dir_index may not be practical for everyone.
> > >
> > > A "no_64bit_cookies" export option would provide a workaround for NFS
> > > servers with older NFS clients, but not for applications like gluster.
> > >
> > > For that reason I'd rather have a way to turn this off on a given ext4
> > > filesystem. Is that practical?
> >
> > I think Ted needs to answer if he would accept another mount option. But
> > before we are going this way, what is gluster doing if there are hash
> > collisions?
>
> They probably just haven't tested NFS with large enough directories.
> The birthday paradox says you'd need about 2^16 entries to have a 50-50
> chance of hitting the problem.
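For reference, the birthday estimate quoted above can be checked with a
short sketch. It assumes uniformly distributed 32-bit hashes (the width
named earlier in the thread) and uses the standard approximation
P ~= 1 - exp(-n(n-1)/(2N)); the function names are made up for
illustration and are not taken from any kernel or gluster code.

```python
import math


def collision_probability(n_entries: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_entries share a hash.

    Assumes hashes are uniform over 2**hash_bits values (an idealization).
    """
    space = 2.0 ** hash_bits
    return 1.0 - math.exp(-n_entries * (n_entries - 1) / (2.0 * space))


def entries_for_probability(p: float, hash_bits: int) -> float:
    """Approximate directory size at which the collision chance reaches p."""
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    # With 32-bit hashes, 2**16 entries already give a sizeable chance.
    print(collision_probability(2 ** 16, 32))
    # Directory size for an even chance of at least one collision.
    print(entries_for_probability(0.5, 32))
```

Under these assumptions 2^16 entries give a collision chance of roughly
0.39, and the even-odds point is near 77,000 entries, so "about 2^16"
is the right order of magnitude.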

The Gluster NFS server gets into an infinite loop:
- https://bugzilla.redhat.com/show_bug.cgi?id=838784

The general advice (even before this bug) is to use XFS, which is not
affected by this problem (yet?).
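
To illustrate the assumption described in the quoted text (taking the
top bits of the offset for its own use), here is a hypothetical sketch.
The 8-bit split, the constants and the function names are assumptions
chosen for the example; this is not gluster's actual encoding. It only
shows why a scheme of this kind works while offsets are "small" and
stops round-tripping once the filesystem returns cookies that use the
high bits.

```python
# Hypothetical layout: top 8 bits carry a sub-volume index, the rest the
# backend directory offset. Both widths are assumptions for illustration.
SUBVOL_BITS = 8
OFFSET_BITS = 64 - SUBVOL_BITS
OFFSET_MASK = (1 << OFFSET_BITS) - 1
U64_MASK = (1 << 64) - 1


def pack(subvol: int, offset: int) -> int:
    """Combine a sub-volume index and a backend offset into one 64-bit cookie."""
    return ((subvol << OFFSET_BITS) | (offset & OFFSET_MASK)) & U64_MASK


def unpack(cookie: int) -> tuple[int, int]:
    """Split a cookie back into (sub-volume index, backend offset)."""
    return cookie >> OFFSET_BITS, cookie & OFFSET_MASK


def round_trips(subvol: int, offset: int) -> bool:
    """True if the backend offset survives packing and unpacking unchanged."""
    return unpack(pack(subvol, offset)) == (subvol, offset)


if __name__ == "__main__":
    print(round_trips(3, 0x7FFFFFFF))          # small, 31-bit offset
    print(round_trips(3, 0x7FFFFFFFFFFFFFFF))  # offset using the high bits
```

With a 31-bit offset the value comes back intact; with an offset that
occupies the high bits, the masked-off part is lost and the offset
handed back to the filesystem no longer matches what it returned.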
Cheers,
Niels