Message-ID: <775371813836c06af830d9dbf6b191728636e911.camel@kernel.org>
Date: Fri, 14 Nov 2025 12:21:08 -0500
From: Jeff Layton <jlayton@...nel.org>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: Christoph Hellwig <hch@....de>, Christian Brauner <brauner@...nel.org>,
Al Viro <viro@...iv.linux.org.uk>, David Sterba <dsterba@...e.com>, Jan
Kara <jack@...e.cz>, Mike Marshall <hubcap@...ibond.com>, Martin
Brandenburg <martin@...ibond.com>, Carlos Maiolino <cem@...nel.org>,
Stefan Roesch <shr@...com>, linux-kernel@...r.kernel.org,
linux-btrfs@...r.kernel.org, gfs2@...ts.linux.dev,
io-uring@...r.kernel.org, devel@...ts.orangefs.org,
linux-unionfs@...r.kernel.org, linux-mtd@...ts.infradead.org,
linux-xfs@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: Re: re-enable IOCB_NOWAIT writes to files

On Fri, 2025-11-14 at 09:01 -0800, Darrick J. Wong wrote:
> On Fri, Nov 14, 2025 at 09:04:58AM -0500, Jeff Layton wrote:
> > On Fri, 2025-11-14 at 07:26 +0100, Christoph Hellwig wrote:
> > > Hi all,
> > >
> > > commit 66fa3cedf16a ("fs: Add async write file modification handling.")
> > > effectively disabled IOCB_NOWAIT writes as timestamp updates currently
> > > always require blocking, and the modern timestamp resolution means we
> > > always update timestamps. This leads to a lot of context switches from
> > > applications using io_uring to submit file writes, making it often worse
> > > than using the legacy aio code that is not using IOCB_NOWAIT.
> > >
> > > This series allows non-blocking updates for lazytime if the file system
> > > supports it, and adds that support for XFS.
> > >
> > > It also fixes the layering bypass in btrfs when updating timestamps on
> > > device files for devices removed from btrfs usage, and FMODE_NOCMTIME
> > > handling in the VFS now that nfsd started using it. Note that I'm still
> > > not sure that nfsd usage is fully correct for all file systems, as only
> > > XFS explicitly supports FMODE_NOCMTIME, but at least the generic code
> > > does the right thing now.
> > >
> > > Diffstat:
> > >  Documentation/filesystems/locking.rst |   2
> > >  Documentation/filesystems/vfs.rst     |   6 ++
> > >  fs/btrfs/inode.c                      |   3 +
> > >  fs/btrfs/volumes.c                    |  11 +--
> > >  fs/fat/misc.c                         |   3 +
> > >  fs/fs-writeback.c                     |  53 ++++++++++++++----
> > >  fs/gfs2/inode.c                       |   6 +-
> > >  fs/inode.c                            | 100 +++++++++++-----------------------
> > >  fs/internal.h                         |   3 -
> > >  fs/orangefs/inode.c                   |   7 ++
> > >  fs/overlayfs/inode.c                  |   3 +
> > >  fs/sync.c                             |   4 -
> > >  fs/ubifs/file.c                       |   9 +--
> > >  fs/utimes.c                           |   1
> > >  fs/xfs/xfs_iops.c                     |  29 ++++++++-
> > >  fs/xfs/xfs_super.c                    |  29 ---------
> > >  include/linux/fs.h                    |  17 +++--
> > >  include/trace/events/writeback.h      |   6 --
> > >  18 files changed, 152 insertions(+), 140 deletions(-)
> >
> > This all looks pretty reasonable to me. There are a few changelog and
> > subject line typos, but the code changes look fine. You can add:
> >
> > Reviewed-by: Jeff Layton <jlayton@...nel.org>
> >
> > As far as nfsd's usage of FMODE_NOCMTIME, it looks OK to me. That's
> > implemented today by the check in file_modified_flags(), which is
> > generic and should work across filesystems.
> >
> > The main exception is xfs_exchange_range() which has some special
> > handling for it, but nfsd doesn't use that functionality so that
> > shouldn't be an issue.
> >
> > Am I missing some subtlety?
>
> In exchangerange specifically?
>
> The FMODE_NOCMTIME checks in xfs_exchange_range exist to tell the
> exchange-range code to update cmtime, but only if it decides to actually
> go through with the mapping exchange. Since the mapping exchange
> requires a transaction anyway, it's cheap to bundle in timestamp
> updates.
>
> Also there's no way that we can do nonblocking exchangerange so a NOWAIT
> flag wouldn't be much help here anyway.
>
> (I hope that answers your question)
>
>

Christoph mentioned nfsd might be doing something wrong, which is my
main interest here. nfsd doesn't have a way to expose exchangerange
functionality right now, but if it did then it seems like that would
just work too.

HCH says:

> Nothing requires file_update_time / file_modified_flags are helpers
> that a file system may or may not call. I've not done an audit
> if everyone actually uses them.

I'll have to think about how to efficiently audit that. The good news
is that nfsd really only cares about the write() and page_mkwrite()
codepaths. For other activity, the delegation will be broken and
recalled.
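
For reference, the generic FMODE_NOCMTIME check discussed above can be
modeled in userspace roughly as follows. This is a simplified sketch, not
the kernel source: the model_ names, the flag values, and the two boolean
fields are hypothetical stand-ins, and it assumes the pre-series behavior
described in the cover letter, where a needed timestamp update under
IOCB_NOWAIT cannot proceed without blocking. It leaves out everything else
the real helper does.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
#define MODEL_FMODE_NOCMTIME 0x1u
#define MODEL_IOCB_NOWAIT    0x2u

/* Minimal model of an open file: only what the check needs. */
struct model_file {
	unsigned int f_mode;
	bool timestamps_stale;   /* would a c/mtime update be needed? */
	bool timestamps_updated; /* set once the model performs an update */
};

/*
 * Model of the ordering of the checks: the NOCMTIME test comes first,
 * so a file opened that way never reaches the timestamp update at all.
 */
static int model_file_modified_flags(struct model_file *file,
				     unsigned int flags)
{
	/* Opener asked for no c/mtime updates: nothing to do. */
	if (file->f_mode & MODEL_FMODE_NOCMTIME)
		return 0;

	/* Timestamps already current: nothing to do. */
	if (!file->timestamps_stale)
		return 0;

	/* Assumed pre-series behavior: an update would block. */
	if (flags & MODEL_IOCB_NOWAIT)
		return -EAGAIN;

	file->timestamps_updated = true;
	file->timestamps_stale = false;
	return 0;
}
```

The point of the model is only that the skip happens in the shared helper
rather than in per-filesystem code, so it applies to any filesystem whose
write() and page_mkwrite() paths actually call that helper.
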
--
Jeff Layton <jlayton@...nel.org>