Message-ID: <20150516144527.20b89194@notabene.brown>
Date: Sat, 16 May 2015 14:45:27 +1000
From: NeilBrown <neilb@...e.de>
To: Al Viro <viro@...IV.linux.org.uk>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andreas Dilger <adilger@...ger.ca>,
Dave Chinner <david@...morbit.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Christoph Hellwig <hch@...radead.org>
Subject: Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks
On Sat, 16 May 2015 02:47:18 +0100 Al Viro <viro@...IV.linux.org.uk> wrote:
> On Sat, May 16, 2015 at 11:25:03AM +1000, NeilBrown wrote:
> > But surely those things can be managed with a spinlock.
> >
> > I think a big part of the problem is that the VFS tries to control
> > filesystems rather than provide services to them.
>
> What with being the thing syscalls talk to for sending the requests to
> filesystems... Do you really want to push the pathname resolution into
> fs code? You've looked at it lately, right?
Yes, I've looked lately :-)

I think that all of RCU-walk, and probably some of REF-walk, should happen
before the filesystem gets to see anything.
But once you hit a non-positive dentry or the parent of the target name, I'd
rather hand over to the FS.

NFSv4 has the ability to look up multiple components in a single LOOKUP call.
VFS doesn't give it a chance to try because it wants to go step-by-step, and
wants each entry in the cache to have an inode etc.
The earlier the filesystem gets control, the less completely-general the VFS
needs to be.
>
> > I'm not convinced that serialising 'lookup' calls is vital. If two threads
> > find a 'not-validated' dentry, and both try to look up the inode, they
> > will both ultimately get the same struct inode from the icache, and will both
> > succeed in connecting it to the dentry. Obviously it would be better to
> > avoid two concurrent NFS "LOOKUP" requests, but that is a problem for NFS to
> > solve. I suspect that using d_fsdata to point to a pending LOOKUP request
> > would allow the "second" thread to wait for that request to finish. Other
> > filesystems would take a completely different approach.
>
> See upthread regarding multiple negative dentries with the same name and fun
> consequences thereof. There might be _NO_ inode. At all. dcache has a large
> negative component and without it you'd get really fucked on NFS as soon
> as you try to compile anything. Shitloads of headers, looked up in a lot of
> directories. Most of the lookups ending up negative. We really do need that
> stuff...
Of course negative dentries are important and having multiple would be
unfortunate. I don't suggest that for a moment.

I'm suggesting three different states for a dentry: positive, negative, don't
know. "don't know" is a new state that isn't currently allowed.

While a filesystem is performing 'lookup', doing its own locking or not, the
dentry would be "don't know". Anything that needed to know would block
somewhere in the filesystem code on whatever lock, waitqueue or other
mechanism the filesystem developer felt was appropriate. On i_mutex if
generic_foo() was in use.
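
To make the three states concrete, here is a rough user-space model in Python. It is only a sketch of the idea, not kernel code: the names State, Dentry, resolve() and fs_lookup are all made up for illustration, and the condition variable stands in for whatever lock or waitqueue a real filesystem would choose. The first thread to find a "don't know" dentry runs the lookup; everyone else blocks until the state is settled.

```python
import threading
from enum import Enum


class State(Enum):
    DONT_KNOW = "don't know"
    POSITIVE = "positive"
    NEGATIVE = "negative"


class Dentry:
    """Toy stand-in for a dentry; not the kernel's struct dentry."""

    def __init__(self, name):
        self.name = name
        self.state = State.DONT_KNOW
        self.inode = None
        # Stands in for the filesystem's own lock or waitqueue.
        self._cond = threading.Condition()
        # Plays the role of a pending request hung off d_fsdata.
        self._lookup_running = False

    def resolve(self, fs_lookup):
        """Return (state, inode); fs_lookup(name) runs at most once
        at a time, and not at all once the state is known."""
        with self._cond:
            while self.state is State.DONT_KNOW:
                if not self._lookup_running:
                    # Nobody is asking the filesystem yet: we do it.
                    self._lookup_running = True
                    break
                # Someone else is already looking this name up: wait.
                self._cond.wait()
            else:
                # State became (or already was) positive or negative.
                return self.state, self.inode

        # Do the slow part without holding the lock.
        try:
            inode = fs_lookup(self.name)
        except BaseException:
            # Lookup failed: stay "don't know" and let a waiter retry.
            with self._cond:
                self._lookup_running = False
                self._cond.notify_all()
            raise

        with self._cond:
            self.inode = inode
            self.state = (State.POSITIVE if inode is not None
                          else State.NEGATIVE)
            self._lookup_running = False
            self._cond.notify_all()
            return self.state, self.inode
```

In this model a second caller never creates a second dentry and never issues a second lookup; it just sleeps on the same object until the first caller records a positive or negative answer.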
If NFSv4 did a multi-component lookup, the intermediate dentries would be
"don't know" even while they had children. For local filesystems, that sort
of thing would never happen. For NFS - which has to allow for random changes
on the server anyway - it is just part of the game.
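
A similarly rough Python sketch of the multi-component case, again purely illustrative: Node, walk() and fs_lookup_remainder are hypothetical names, and the callback stands in for a single multi-component request to the server. The walk consumes cached positive entries, then hands the whole remainder to the filesystem in one call. Intermediate entries it has to create are left as "dont_know" even though they gain children.

```python
class Node:
    """Toy cache entry; state is 'positive', 'negative' or 'dont_know'."""

    def __init__(self, name, state="dont_know"):
        self.name = name
        self.state = state
        self.children = {}


def walk(root, path, fs_lookup_remainder):
    """Follow cached positive components, then give the rest of the
    path to the filesystem in a single call."""
    parts = [p for p in path.split("/") if p]
    node = root
    for i, part in enumerate(parts):
        child = node.children.get(part)
        if child is None or child.state != "positive":
            remainder = parts[i:]
            # One call for all remaining components; it reports only
            # the state of the final name.
            final_state = fs_lookup_remainder(node, remainder)
            # Intermediate entries are created if missing and are not
            # validated individually, so new ones stay "dont_know".
            for name in remainder[:-1]:
                node = node.children.setdefault(name, Node(name))
            leaf = node.children.setdefault(remainder[-1],
                                            Node(remainder[-1]))
            leaf.state = final_state
            return leaf
        node = child
    return node
```

For brevity the sketch hands over only at the first non-positive entry; handing over at the parent of the target name would be a small variation on the same loop.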
NeilBrown