[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230906175725.GF28160@frogsfrogsfrogs>
Date: Wed, 6 Sep 2023 10:57:25 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Matthew Wilcox <willy@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-bcachefs@...r.kernel.org,
Christian Brauner <christian@...uner.io>
Subject: Re: [GIT PULL] bcachefs
On Wed, Sep 06, 2023 at 12:10:02PM -0400, Kent Overstreet wrote:
> On Wed, Sep 06, 2023 at 01:41:22AM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 05, 2023 at 08:00:07PM -0400, Kent Overstreet wrote:
> > > On Tue, Sep 05, 2023 at 06:24:47AM -0700, Christoph Hellwig wrote:
> > > > Hi Kent,
> > > >
> > > > I thought you'd follow Christians proposal to actually work with
> > > > people to try to use common APIs (i.e. to use iomap once it's been
> > > > even more iter-like, for which I'm still waiting for suggestions),
> > > > and make your new APIs used more widely if they are a good idea
> > > > (which also requires explaining them better) and aim for 6.7 once
> > > > that is done.
> > >
> > > Christoph, I get that iomap is your pet project and you want to make it
> > > better and see it more widely used.
Christoph quit as maintainer 78 days ago.
> > > But the reasons bcachefs doesn't use iomap have been discussed at
> > > length, and I've posted and talked about the bcachefs equivalents of
> > > that code. You were AWOL on those discussions; you consistently say
> > > "bcachefs should use iomap" and then walk away, so those discussions
> > > haven't moved forwards.
> > >
> > > To recap, besides being more iterator like (passing data structures
> > > around with iterators, vs. indirect function calls into the filesystem),
> > > bcachefs also hangs a bit more state off the pagecache, due to being
> > > multi device and also using the same data structure for tracking disk
> > > reservations (because why make the buffered write paths look that up
> > > separately?).
> >
> > I /thought/ the proposal was to use iomap for bcachefs DIO and leave
I wasn't aware of /any/ active effort to make bcachefs use iomap for any
purpose.
> > buffered writes for a different day. I agree the iomap buffered write
> > path is inappropriate for bcachefs today. I'd like that to change,
> > but there's a lot of functionality that it would need to support.
>
> No, I'm not going to convert the bcachefs DIO path to iomap; not as it
> exitss now.
>
> Right now I've got a clean separation between the VFS level DIO code,
> and the lower level bcachefs code that walks the extents btree and
> issues IOs. I have to consider the iomap approach where the
> loop-up-mappings-and-issue loop is in iomap code but calling into
> filesystem code pretty gross.
>
> I was talking about this /years/ ago when I did the work to make it
> possible to efficiently split bios - iomap could have taken the approach
> bcachefs did, the prereqs were in place when iomap was started, but it
> didn't happen - iomap ended up being a more conservative approach, a
> cleaned up version of buffer heads and fs/direct-IO.c.
<shrug> iirc the genesis of xfs "iomap" was that it was created to
supply space mappings to pnfs, and then got reused for directio and
pagecache operations. Later that was hoisted wholesale to the vfs.
Then spectre/meltdown happened and our indirect function call toy got
taken away, and now we're stuck figuring out how to remove them all.
IOWs, individual tactics that each made sense for maintaining the
overall stability of xfs, but overall didn't amount to anything
ambitious.
In the end I'd probably rather do something like this to eliminate all
the indirect function calls:
int
xfs_zero_range(
struct xfs_inode *ip,
loff_t pos,
loff_t len,
bool *did_zero)
{
struct iomap_iter iter = { };
struct inode *inode = VFS_I(ip);
int ret = 0;
iomap_start_zero_range(&iter, inode, pos, len);
while (iter.len > 0) {
ret = xfs_buffered_write_iomap_begin(&iter, &iter.write_map,
&iter.read_map);
if (ret)
break;
len = iomap_zero_range_iter(&iter, did_zero);
if (len < 0) {
ret = len;
break;
}
ret = xfs_buffered_write_iomap_end(&iter, len, &iter.write_map);
if (ret)
break;
iomap_iter_advance(&iter, len);
}
iomap_end_zero_range(&iter);
return iter.write > 0 ? iter.write : ret;
}
(I would also rename iter.iomap and iter.srcmap to make it more obvious
which ones get filled out under what circumstances.)
> That's fine, iomap is certainly an improvement over what it was
> replacing, but it would not be an improvement for bcachefs.
Filesystems are free to implement read_ and write_iter as they choose,
whether or not that involves iomap.
> I think it might be more fruitful to look at consolidating the buffered
> IO code in bcachefs and iomap. The conceptual models are a bit closer,
> bcachefs's buffered IO code is just a bit more fully-featured in that it
> does the dirty block tracking in a unified way. That was a project that
> turned out pretty nicely.
I think I'd much rather gradually revise iomap for everyone and cut
bcachefs over when its ready. I don't think revising iomap is a
reasonable precondition for merging bcachefs, nor do I think it's a good
idea to risk destabilizing bcachefs just to hack in a new dependency
that won't even fit well.
--D
Powered by blists - more mailing lists