[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230427-rebel-vergibt-99cf6a7838a2@brauner>
Date: Thu, 27 Apr 2023 11:51:11 +0200
From: Christian Brauner <brauner@...nel.org>
To: Jeff Layton <jlayton@...nel.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
"Darrick J. Wong" <djwong@...nel.org>,
Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Chinner <david@...morbit.com>,
Chuck Lever <chuck.lever@...cle.com>, Jan Kara <jack@...e.cz>,
Amir Goldstein <amir73il@...il.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-mm@...ck.org,
linux-nfs@...r.kernel.org
Subject: Re: [PATCH v2 1/3] fs: add infrastructure for multigrain inode
i_m/ctime
On Wed, Apr 26, 2023 at 05:48:38AM -0400, Jeff Layton wrote:
> On Wed, 2023-04-26 at 09:07 +0200, Christian Brauner wrote:
> > On Mon, Apr 24, 2023 at 11:11:02AM -0400, Jeff Layton wrote:
> > > The VFS always uses coarse-grained timestamp updates for filling out the
> > > ctime and mtime after a change. This has the benefit of allowing
> > > filesystems to optimize away a lot metaupdates, to around once per
> > > jiffy, even when a file is under heavy writes.
> > >
> > > Unfortunately, this has always been an issue when we're exporting via
> > > NFSv3, which relies on timestamps to validate caches. Even with NFSv4, a
> > > lot of exported filesystems don't properly support a change attribute
> > > and are subject to the same problems with timestamp granularity. Other
> > > applications have similar issues (e.g backup applications).
> > >
> > > Switching to always using fine-grained timestamps would improve the
> > > situation for NFS, but that becomes rather expensive, as the underlying
> > > filesystem will have to log a lot more metadata updates.
> > >
> > > What we need is a way to only use fine-grained timestamps when they are
> > > being actively queried:
> > >
> > > Whenever the mtime changes, the ctime must also change since we're
> > > changing the metadata. When a superblock has a s_time_gran >1, we can
> > > use the lowest-order bit of the inode->i_ctime as a flag to indicate
> > > that the value has been queried. Then on the next write, we'll fetch a
> > > fine-grained timestamp instead of the usual coarse-grained one.
> > >
> > > We could enable this for any filesystem that has a s_time_gran >1, but
> > > for now, this patch adds a new SB_MULTIGRAIN_TS flag to allow filesystems
> > > to opt-in to this behavior.
> >
> > Hm, the patch raises the flag in s_flags. Please at least move this to
> > s_iflags as SB_I_MULTIGRAIN and treat this as an internal flag. There's
> > no need to give the impression that this will become a mount option.
> >
> > Also, this looks like it's a filesystem property not a superblock
> > property as the granularity isn't changeable. So shouldn't this be an
> > FS_* flag instead?
>
> It could be a per-sb thing if there was some filesystem that wanted to
> do that, but I'm hoping that most will not want to do that.
Yeah, I'd really hope this isn't an sb thing.
>
> My initial patches for this actually did use a FS_* flag, but I figured
Oh, I might've just missed that.
> that was one more pointer to chase when you wanted to check the flag.
Hm, unless you have reasons to think that it would be noticable in terms
of perf I'd rather do the correct thing and have it be an FS_* flag.
Powered by blists - more mailing lists