[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <03e91ee4c56829995c08f4f8fb1052d3c6cc40c4.camel@kernel.org>
Date: Wed, 26 Apr 2023 05:48:38 -0400
From: Jeff Layton <jlayton@...nel.org>
To: Christian Brauner <brauner@...nel.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
"Darrick J. Wong" <djwong@...nel.org>,
Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Chinner <david@...morbit.com>,
Chuck Lever <chuck.lever@...cle.com>, Jan Kara <jack@...e.cz>,
Amir Goldstein <amir73il@...il.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-mm@...ck.org,
linux-nfs@...r.kernel.org
Subject: Re: [PATCH v2 1/3] fs: add infrastructure for multigrain inode
i_m/ctime
On Wed, 2023-04-26 at 09:07 +0200, Christian Brauner wrote:
> On Mon, Apr 24, 2023 at 11:11:02AM -0400, Jeff Layton wrote:
> > The VFS always uses coarse-grained timestamp updates for filling out the
> > ctime and mtime after a change. This has the benefit of allowing
> > filesystems to optimize away a lot metaupdates, to around once per
> > jiffy, even when a file is under heavy writes.
> >
> > Unfortunately, this has always been an issue when we're exporting via
> > NFSv3, which relies on timestamps to validate caches. Even with NFSv4, a
> > lot of exported filesystems don't properly support a change attribute
> > and are subject to the same problems with timestamp granularity. Other
> > applications have similar issues (e.g backup applications).
> >
> > Switching to always using fine-grained timestamps would improve the
> > situation for NFS, but that becomes rather expensive, as the underlying
> > filesystem will have to log a lot more metadata updates.
> >
> > What we need is a way to only use fine-grained timestamps when they are
> > being actively queried:
> >
> > Whenever the mtime changes, the ctime must also change since we're
> > changing the metadata. When a superblock has a s_time_gran >1, we can
> > use the lowest-order bit of the inode->i_ctime as a flag to indicate
> > that the value has been queried. Then on the next write, we'll fetch a
> > fine-grained timestamp instead of the usual coarse-grained one.
> >
> > We could enable this for any filesystem that has a s_time_gran >1, but
> > for now, this patch adds a new SB_MULTIGRAIN_TS flag to allow filesystems
> > to opt-in to this behavior.
>
> Hm, the patch raises the flag in s_flags. Please at least move this to
> s_iflags as SB_I_MULTIGRAIN and treat this as an internal flag. There's
> no need to give the impression that this will become a mount option.
>
> Also, this looks like it's a filesystem property not a superblock
> property as the granularity isn't changeable. So shouldn't this be an
> FS_* flag instead?
It could be a per-sb thing if there was some filesystem that wanted to
do that, but I'm hoping that most will not want to do that.
My initial patches for this actually did use a FS_* flag, but I figured
that was one more pointer to chase when you wanted to check the flag.
I can change it back to that though. Let me know what you'd prefer.
--
Jeff Layton <jlayton@...nel.org>
Powered by blists - more mailing lists