Message-ID: <20120415192714.GA3842@fieldses.org>
Date: Sun, 15 Apr 2012 15:27:14 -0400
From: "J. Bruce Fields" <bfields@...ldses.org>
To: Bernd Schubert <bernd.schubert@...m.fraunhofer.de>
Cc: Jeff Layton <jlayton@...hat.com>,
Malahal Naineni <malahal@...ibm.com>,
linux-nfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, pstaubach@...grid.com,
miklos@...redi.hu, viro@...IV.linux.org.uk, hch@...radead.org,
michael.brantley@...haw.com, sven.breuner@...m.fraunhofer.de
Subject: Re: [PATCH RFC] vfs: make fstatat retry on ESTALE errors from getattr call
On Sun, Apr 15, 2012 at 09:03:23PM +0200, Bernd Schubert wrote:
> On 04/13/2012 05:42 PM, Jeff Layton wrote:
> > (note: please don't trim the CC list!)
> >
> > Indefinitely does make some sense (as Peter articulated in his original
> > set). It's possible you could race several times in a row, or a server
> > misconfiguration or something has happened and you have a transient
> > error that will eventually recover. His assertion was that any limit on
> > the number of retries is by definition wrong. For NFS, a fatal signal
> > ought to interrupt things as well, so retrying indefinitely has some
> > appeal there.
> >
> > OTOH, we do have to contend with filesystems that might return ESTALE
> > persistently for other reasons and that might not respond to signals.
> > Miklos pointed out that some FUSE fs' do this in his review of Peter's
> > set.
> >
> > As a purely defensive coding measure, limiting the number of retries to
> > something finite makes sense. If we're going to do that though, I'd
> > probably recommend that we set the number of retries to something
> > higher just so that this is more resilient in the face of multiple
> > races. Those other fs' might "spin" a bit in that case but it is an
> > error condition and IMO resiliency trumps performance -- at least in
> > this case.
>
> I am definitely voting against an infinite number of retries. I'm
> working on FhGFS, which supports distributed meta data servers. So when
> a file is moved around between directories, its file handle, which
> contains the meta-data target id might become invalid. As NFSv3 is
> stateless we cannot inform the client about that and must return ESTALE
> then.

Note we're not talking about retrying the operation that returned ESTALE
with the same filehandle--probably any server would return ESTALE again
in that case.

We're talking about re-looking up the path (in the case where we're
implementing a system call that takes a path as an argument), and then
retrying the operation with the newly looked-up filehandle.
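
A minimal userspace sketch of that loop, assuming a bounded retry count
along the lines Jeff suggested. The names fake_server, lookup_path,
fake_getattr and stat_with_retry, and the limit MAX_ESTALE_RETRIES, are
all hypothetical; this only simulates the idea and is not the actual VFS
code or the patch under discussion.

```c
#include <errno.h>

#define MAX_ESTALE_RETRIES 3	/* hypothetical bound, not from the patch */

/* Hypothetical stand-in for a server whose file may move between calls. */
struct fake_server {
	int current_handle;	/* the only handle the server accepts now */
	int renames_left;	/* how many more times the file will move */
};

/* Resolve the path again: returns whatever handle is valid right now. */
static int lookup_path(const struct fake_server *srv)
{
	return srv->current_handle;
}

/*
 * Simulated getattr. While renames_left > 0 the file moves just before
 * the call is served, so the handle we looked up is already stale.
 */
static int fake_getattr(struct fake_server *srv, int handle)
{
	if (srv->renames_left > 0) {
		srv->renames_left--;
		srv->current_handle++;
	}
	return handle == srv->current_handle ? 0 : -ESTALE;
}

/*
 * On ESTALE, do NOT resend the same handle; redo the lookup and retry
 * with the fresh handle, giving up after MAX_ESTALE_RETRIES retries.
 */
static int stat_with_retry(struct fake_server *srv, int *attempts)
{
	int err = -ESTALE;
	int i;

	*attempts = 0;
	for (i = 0; i <= MAX_ESTALE_RETRIES; i++) {
		int handle = lookup_path(srv);	/* fresh lookup every time */

		(*attempts)++;
		err = fake_getattr(srv, handle);
		if (err != -ESTALE)
			break;
	}
	return err;
}
```

With a file that moves twice, the third attempt succeeds; with one that
keeps moving, the caller gets ESTALE after the bounded number of tries
instead of spinning forever.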
--b.
> NFSv4 is better, but I'm not sure how well invalidating a file
> handle works. So retrying once on ESTALE might be a good idea, but
> retrying forever is not.
> Also, what about asymmetric HA servers? I seem to remember that this also
> resulted in ESTALE. So for example server1 exports /home and /scratch,
> but on failure server2 can only take over /home and denies access to
> /scratch.
>
>
> Thanks,
> Bernd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html