[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87boo9k861.fsf@tucsk.pomaz.szeredi.hu>
Date: Tue, 06 Mar 2012 17:26:14 +0100
From: Miklos Szeredi <miklos@...redi.hu>
To: "Myklebust\, Trond" <Trond.Myklebust@...app.com>
Cc: "viro\@ZenIV.linux.org.uk" <viro@...IV.linux.org.uk>,
"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
"hch\@infradead.org" <hch@...radead.org>
Subject: Re: [PATCH 9/9] nfs: don't open in ->d_revalidate
"Myklebust, Trond" <Trond.Myklebust@...app.com> writes:
> On Tue, 2012-03-06 at 13:56 +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@...e.cz>
>>
>> NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
>> mount needs to be followed or not. It does check d_mountpoint() on the dentry,
>> which can result in a weird error if the VFS found that the mount does not in
>> fact need to be followed, e.g.:
>>
>> # mount --bind /mnt/nfs /mnt/nfs-clone
>> # echo something > /mnt/nfs/tmp/bar
>> # echo x > /tmp/file
>> # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
>> # cat /mnt/nfs/tmp/bar
>> cat: /mnt/nfs/tmp/bar: Not a directory
>>
>> Which should, by any sane filesystem, result in "something" being printed.
>>
>> So instead do the open in f_op->open() and in the unlikely case that the cached
>> dentry turned out to be invalid, drop the dentry and return ESTALE to let the
>> VFS retry.
>
> This patch would force a complete new walk of the path in cases where
> today we just do a single lookup of the last component.
I sort of anticipated that, but I really don't know how much of an issue
is this.
Of course it is possible to redo only the last component after
f_op->open() return ESTALE but that requires reshuffling do_last().
> It really
> doesn't seem worth taking that penalty just in order to make some insane
> bind mount corner cases work.
It's not at all insane if for example we have multiple mount namespaces.
NFS is clearly broken and it needs to be fixed, one way or another.
If you think this is a serious performance regression then lets drop
this patch and add it to the atomic open series together with the
do_last() reshuffling.
Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists