linux-kernel - Re: [PATCH RFC] vfs: make fstatat retry on ESTALE errors from getattr call

Message-ID: <20120417093222.2ff5e1bd@corrin.poochiereds.net>
Date:	Tue, 17 Apr 2012 09:32:22 -0400
From:	Jeff Layton <jlayton@...hat.com>
To:	Miklos Szeredi <miklos@...redi.hu>
Cc:	"Myklebust\, Trond" <Trond.Myklebust@...app.com>,
	Bernd Schubert <bernd.schubert@...m.fraunhofer.de>,
	Malahal Naineni <malahal@...ibm.com>,
	"linux-nfs\@vger.kernel.org" <linux-nfs@...r.kernel.org>,
	"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
	"pstaubach\@exagrid.com" <pstaubach@...grid.com>,
	"viro\@ZenIV.linux.org.uk" <viro@...IV.linux.org.uk>,
	"hch\@infradead.org" <hch@...radead.org>,
	"michael.brantley\@deshaw.com" <michael.brantley@...haw.com>,
	"sven.breuner\@itwm.fraunhofer.de" <sven.breuner@...m.fraunhofer.de>
Subject: Re: [PATCH RFC] vfs: make fstatat retry on ESTALE errors from
 getattr call

On Tue, 17 Apr 2012 15:12:20 +0200
Miklos Szeredi <miklos@...redi.hu> wrote:

> Jeff Layton <jlayton@...hat.com> writes:
> 
> >> 
> >> Won't something like fstatat(AT_FDCWD, "", &stat, AT_EMPTY_PATH) risk
> >> looping forever there, or am I missing something?
> >> 
> >
> > To make sure I understand, that should be "shortcut" for a lookup of the
> > cwd?
> >
> > So I guess the concern is that you'd do the above and get a successful
> > lookup since you're just going to get back the cwd. At that point,
> > you'd attempt the getattr and get ESTALE back. Then, you'd redo the
> > lookup with LOOKUP_REVAL set -- but since we're operating on the
> > cwd, we don't have a way to redo the lookup since we don't have a
> > pathname that we can look up again...
> >
> > So yeah, I guess if you're sitting in a stale directory, something like
> > that could loop eternally.
> >
> > Do you think the proposed check for fatal_signal_pending is enough to
> > mitigate such a problem? Or do we need to limit the number of retries
> > to address those sorts of loops?
> 
> Let's step back a bit.
> 
> The retry is needed when we discover during ->getattr() that the
> cached lookup returned a stale file handle.
> 
> If the lookup wasn't cached or if there was no lookup at all
> (stat(".") and friends) then retrying will not gain anything.
> 

That's not necessarily the case, at least not with NFS. It's easily
possible for you to do a full-fledged lookup over the wire, and then
for that inode to be removed prior to issuing a call against the FH that
you got back. 

> And that also means that retrying multiple times is pointless, since
> after the first retry we are sure to have up-to-date attributes.
> 

Again, it's not pointless. It's possible (though somewhat pathological)
for you to hit the race above more than once in the same operation.
Granted, it's an unlikely race but it is possible.
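
As a rough illustration of the retry question above, here is a userspace
sketch (not the patch itself) of a lookup+getattr loop that retries on
ESTALE but gives up after a fixed number of attempts. Everything named
here is hypothetical: stat_with_estale_retry(), MAX_ESTALE_RETRIES, the
callback types and the fake_server helpers are stand-ins for
illustration only, and the "reval" flag merely models the effect of
LOOKUP_REVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cap on retries; the thread does not settle on a value. */
#define MAX_ESTALE_RETRIES 4

/* Hypothetical callbacks standing in for the lookup and ->getattr steps. */
typedef int (*lookup_fn)(bool reval, void *ctx);
typedef int (*getattr_fn)(void *ctx);

/*
 * Do lookup + getattr, redoing both (with revalidation forced) whenever
 * either returns -ESTALE. The cap guarantees termination even when a
 * re-lookup cannot help, e.g. when sitting in a stale cwd.
 */
static int stat_with_estale_retry(lookup_fn lookup, getattr_fn getattr,
                                  void *ctx, int *attempts)
{
    bool reval = false;
    int tries = 0;
    int err;

    for (;;) {
        tries++;
        err = lookup(reval, ctx);
        if (err == 0)
            err = getattr(ctx);
        if (err != -ESTALE || tries > MAX_ESTALE_RETRIES)
            break;
        reval = true;   /* later passes bypass the cached lookup */
    }
    if (attempts != NULL)
        *attempts = tries;
    return err;
}

/* Toy server: getattr reports ESTALE for the next stale_left calls. */
struct fake_server {
    int stale_left;
    int reval_lookups;
};

static int fake_lookup(bool reval, void *ctx)
{
    struct fake_server *s = ctx;

    if (reval)
        s->reval_lookups++;
    return 0;   /* the lookup itself always succeeds in this model */
}

static int fake_getattr(void *ctx)
{
    struct fake_server *s = ctx;

    if (s->stale_left > 0) {
        s->stale_left--;
        return -ESTALE;
    }
    return 0;
}
```

With stale_left set to 2, the loop models hitting the race twice and
succeeding on the third pass. With a server that is stale on every call
(the stale-cwd case), the cap is what ends the loop. A real
implementation would presumably also check fatal_signal_pending()
between passes.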

> Unfortunately it's impossible for the filesystem to know whether a
> ->getattr (or other inode operation) was performed after a cached or a
> non-cached lookup.
> 
> I'm not sure what the right interface for this would be.  One would be
> to just pass the "cached-or-not" information as a flag.  That works for
> getattr() but not for other operations.
> 
> Another is to introduce atomic lookup+foo variants of these operations
> just like for open.  E.g. the lookup+getattr is called if the cached
> lookup fails or if the cached lookup succeeds and the plain ->getattr
> call returns ESTALE.
>

To do that would require protocol support that we simply don't have. We
don't have a way to (for instance) say via NFS "give me the attributes
for this filename". Well, at least not for NFSv3...

With v4 you could theoretically construct a compound that does that,
but you'd have to assume that the server won't release the reference to
the inode midway through the compound. That's a reasonably safe
assumption.
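
To make the compound idea concrete, here is a toy model of how such a
request would be evaluated: the ops run in order and processing stops
at the first one that fails. The op names mirror the NFSv4 operations
PUTFH, LOOKUP and GETATTR, but run_compound(), struct op and the
toy_exec helper are hypothetical and do not correspond to any client
or server code.

```c
#include <errno.h>
#include <stddef.h>

/* Op names mirror NFSv4 operations; the types themselves are made up. */
enum op_kind { OP_PUTFH, OP_LOOKUP, OP_GETATTR };

struct op {
    enum op_kind kind;
    int status;         /* filled in as each op is processed */
};

/* Hypothetical per-op executor standing in for the server. */
typedef int (*exec_fn)(enum op_kind kind, void *ctx);

/*
 * Evaluate ops in order and stop at the first failure, as a compound
 * does. Returns how many ops were processed, including the failed one.
 */
static size_t run_compound(struct op *ops, size_t n, exec_fn exec, void *ctx)
{
    size_t i;

    for (i = 0; i < n; i++) {
        ops[i].status = exec(ops[i].kind, ctx);
        if (ops[i].status != 0)
            return i + 1;
    }
    return n;
}

/* Toy server state: fail the op whose kind equals fail_kind (-1: none). */
struct toy_state {
    int fail_kind;
};

static int toy_exec(enum op_kind kind, void *ctx)
{
    struct toy_state *s = ctx;

    return ((int)kind == s->fail_kind) ? -ESTALE : 0;
}
```

A lookup+getattr request would be the sequence { OP_PUTFH, OP_LOOKUP,
OP_GETATTR }. If the GETATTR step still comes back with ESTALE in this
model, the inode went away between the LOOKUP and the GETATTR inside the
same compound, which is the case the assumption above rules out.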

While it's nice to consider new atomic ops like this, it's not really
possible with earlier versions of NFS.

-- 
Jeff Layton <jlayton@...hat.com>