Message-ID: <20240822203510.GS865349@frogsfrogsfrogs>
Date: Thu, 22 Aug 2024 13:35:10 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: Christoph Hellwig <hch@....de>
Cc: John Garry <john.g.garry@...cle.com>,
Dave Chinner <david@...morbit.com>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-xfs@...r.kernel.org
Subject: Re: [PATCH v3 14/21] iomap: Sub-extent zeroing

On Fri, Jul 26, 2024 at 07:13:58PM +0200, Christoph Hellwig wrote:
> On Fri, Jul 26, 2024 at 03:29:48PM +0100, John Garry wrote:
> > I have been considering another approach to solve this problem.
> >
> > In this patch - as you know - we zero unwritten parts of a newly allocated
> > extent. This is so that when we later issue an atomic write, we would not
> > have the problem of unwritten extents and how the iomap iterator will
> > create multiple BIOs (which is not permitted).
> >
> > How about an alternate approach like this:
> > - no sub-extent zeroing
> > - iomap iter is changed to allocate a single BIO for an atomic write in
> > first iteration
> > - each iomap extent iteration appends data to that same BIO
> > - when finished iterating, we submit the BIO
> >
> > Obviously that will mean many changes to the iomap bio iterator, but is
> > quite self-contained.
>
> Yes, I also suggested that during the zeroing fix discussion. There
> is generally no good reason to start a new direct I/O bio if the
> write is contiguous on disk and only the state of the srcmap is different.
> This will also be a big win for COW / out of place overwrites.

But what happens if the pre-write state is:

	WUWUWUWU

You can write all 8 blocks with a single bio, but the directio write
completion has to run four separate transactions to convert the four
unwritten mappings.  For COW it's ok if we crash midway through the
ioend such that a read after recovery sees this:

	WWWWW0W0

because we've never guaranteed what happens if the system crashes before
fsync completes.  For untorn writes this is not allowed (even if the
actual disk contents landed successfully) because we said we wouldn't
tear the write.

--D
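
The crash scenario above can be illustrated with a toy model. This is only
a sketch, not kernel code: the function names are made up for illustration,
"W" stands for a written block, "U" for an unwritten one, and "0" for a
block that reads back as zeroes. It assumes one conversion transaction per
unwritten mapping, committed in file-offset order, which is a simplification.

```python
def unwritten_runs(state):
    """Return (start, length) for each maximal run of 'U' blocks."""
    runs = []
    start = None
    for i, block in enumerate(state):
        if block == "U" and start is None:
            start = i
        elif block != "U" and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(state) - start))
    return runs


def read_after_crash(state, committed):
    """Model what a read sees after recovery.

    Assumption (hypothetical model): the data for every block reached
    disk, but only the first `committed` unwritten->written conversion
    transactions were logged before the crash. Mappings that are still
    unwritten read back as zeroes regardless of what is on disk.
    """
    view = ["W" if block == "W" else "0" for block in state]
    for start, length in unwritten_runs(state)[:committed]:
        view[start:start + length] = ["W"] * length
    return "".join(view)


if __name__ == "__main__":
    pre_write = "WUWUWUWU"
    # Four separate unwritten mappings, so four conversion transactions.
    print(len(unwritten_runs(pre_write)))
    # Crash after two of the four transactions committed.
    print(read_after_crash(pre_write, 2))
```

With two of the four conversions committed, the model yields WWWWW0W0: the
single bio completed, yet the result is still torn from a reader's point of
view.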