Message-ID: <20161229161534.GA29261@fieldses.org>
Date:   Thu, 29 Dec 2016 11:15:34 -0500
From:   "J. Bruce Fields" <bfields@...ldses.org>
To:     Richard Weinberger <richard@....at>
Cc:     linux-mtd@...ts.infradead.org, david@...ma-star.at, tytso@....edu,
        dedekind1@...il.com, adrian.hunter@...el.com,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        adilger.kernel@...ger.ca, akpm@...ux-foundation.org,
        linux-ext4@...r.kernel.org
Subject: Re: [PATCH 3/6] ubifs: Use 64bit readdir cookies
On Thu, Dec 29, 2016 at 04:49:54PM +0100, Richard Weinberger wrote:
> Bruce,
> 
> On 29.12.2016 16:34, J. Bruce Fields wrote:
> >> That way UBIFS can provide a 64bit readdir() cookie which is required for NFS3.
> > 
> > Sounds good.  And if a matching entry isn't found (as in the case of a
> > concurrent unlink), what happens?  The answer must be the same as for
> > ext4, but I've forgotten the details....  I guess it must find the next
> > highest cookie (thinking of the cookie as a 64-bit integer of some kind)
> > that exists in the directory.  And that must be the same order that
> > readdir normally returns entries in.
> 
> If a 64bit cookie is not found, the lookup function returns -ENOENT.
> In UBIFS we cannot just select a higher or lower key (cookie in this case),
> since it is the B-tree key and would point to a completely different
> entry.
> 
> So, in case of a concurrent unlink() one would succeed and one fail with
> -ENOENT. Unless I miss something that seems okay to me.

Unlink takes (parent directory, name), not a directory cookie.

The problem is concurrent unlink and NFS readdir.  So:

	NFS server returns readdir result with cookie X.
	Somebody unlinks the entry at X.
	NFS server gets readdir request with cookie X.

Then the NFS client will get a spurious -ENOENT.
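
To make that sequence concrete, here is a toy model in Python.  It is not
UBIFS or nfsd code: the ToyDir class, its method names, and the assumption
that a cookie names the entry at which the next READDIR resumes are all
invented for illustration.  It only contrasts an exact-match lookup with a
"next highest surviving cookie" lookup.

```python
import bisect
import errno


class ToyDir:
    """Toy directory: entries are identified by integer cookies, kept sorted."""

    def __init__(self, cookies):
        self.cookies = sorted(cookies)

    def unlink(self, cookie):
        """Remove the entry identified by this cookie."""
        self.cookies.remove(cookie)

    def resume_exact(self, cookie):
        """Resume only if the cookie still exists; otherwise fail with ENOENT."""
        i = bisect.bisect_left(self.cookies, cookie)
        if i == len(self.cookies) or self.cookies[i] != cookie:
            raise OSError(errno.ENOENT, "cookie not found")
        return self.cookies[i:]

    def resume_next(self, cookie):
        """Resume at the first surviving cookie >= the requested one."""
        i = bisect.bisect_left(self.cookies, cookie)
        return self.cookies[i:]


if __name__ == "__main__":
    d = ToyDir([10, 20, 30, 40])
    d.unlink(20)                      # somebody unlinks the entry at X = 20
    print(d.resume_next(20))          # continues with the surviving entries
    try:
        d.resume_exact(20)            # exact match: the client sees an error
    except OSError as e:
        print("exact lookup failed:", errno.errorcode[e.errno])
```

The second variant only works if cookies sort in the same order that
readdir returns entries in, which is the property asked about earlier in
the thread.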

I'm not sure how best to reproduce that.... Maybe:

	Create a directory on an NFS-exported filesystem with lots of
	entries.

	Start a loop (or loops?) renaming directory entries within the
	directory as fast as possible (or deleting and creating entries;
	I assume it's the same thing for our purposes).

	Read the directory from an NFS client.

I'm not sure how many entries is "lots".... Ideally you want a single
read of the directory to require the client to make lots of READDIR
requests to the server.  You could help by running:

	echo 1024 >/proc/fs/nfsd/max_block_size

before starting knfsd.  That should force it to return no more than 1K
of data in each READDIR reply.
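
A rough, untested-against-NFS sketch of that procedure in Python follows.
The function names, the file naming scheme, and the default counts are all
hypothetical.  It assumes export_path (the directory on the exported
filesystem) and client_path (the same directory seen through an NFS mount)
are both reachable from one machine, for example via a loopback mount;
otherwise the rename loop would have to run on the server and the listing
on the client.  Whether the error actually surfaces to os.listdir() as
ENOENT is an assumption, so the sketch just records whatever errno it sees.

```python
import os
import threading


def populate(path, count):
    """Create `count` empty files so a full listing needs many READDIR calls."""
    for i in range(count):
        with open(os.path.join(path, f"f{i:06d}"), "w"):
            pass


def churn(path, count, stop):
    """Rename entries away and back again until asked to stop."""
    i = 0
    while not stop.is_set():
        a = os.path.join(path, f"f{i % count:06d}")
        b = a + ".tmp"
        try:
            os.rename(a, b)
            os.rename(b, a)
        except FileNotFoundError:
            pass
        i += 1


def read_dir_under_churn(export_path, client_path, count=2000, passes=5):
    """List client_path repeatedly while export_path is churned.

    Returns the errno of every failed listing (empty list if none failed).
    """
    populate(export_path, count)
    stop = threading.Event()
    worker = threading.Thread(target=churn, args=(export_path, count, stop))
    worker.start()
    errors = []
    try:
        for _ in range(passes):
            try:
                os.listdir(client_path)
            except OSError as e:
                errors.append(e.errno)
    finally:
        stop.set()
        worker.join()
    return errors
```

Calling read_dir_under_churn("/srv/export/testdir", "/mnt/nfs/testdir")
(both paths made up) and checking the result for ENOENT would be one way
to see whether the spurious error shows up.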
--b.