Message-ID: <CANubcdXzxPuh9wweeW0yjprsQRZuBWmJwnEBcihqtvk6n7b=bQ@mail.gmail.com>
Date: Fri, 28 Nov 2025 11:22:49 +0800
From: Stephen Zhang <starzhangzsd@...il.com>
To: Andreas Gruenbacher <agruenba@...hat.com>
Cc: Christoph Hellwig <hch@...radead.org>, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, nvdimm@...ts.linux.dev,
virtualization@...ts.linux.dev, linux-nvme@...ts.infradead.org,
gfs2@...ts.linux.dev, ntfs3@...ts.linux.dev, linux-xfs@...r.kernel.org,
zhangshida@...inos.cn
Subject: Re: [PATCH 1/9] block: fix data loss and stale date exposure problems
during append write
On Sat, Nov 22, 2025 at 00:13, Andreas Gruenbacher <agruenba@...hat.com> wrote:
>
> On Fri, Nov 21, 2025 at 11:38 AM Christoph Hellwig <hch@...radead.org> wrote:
> > On Fri, Nov 21, 2025 at 04:17:40PM +0800, zhangshida wrote:
> > > From: Shida Zhang <zhangshida@...inos.cn>
> > >
> > > Signed-off-by: Shida Zhang <zhangshida@...inos.cn>
> > > ---
> > > block/bio.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/block/bio.c b/block/bio.c
> > > index b3a79285c27..55c2c1a0020 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -322,7 +322,7 @@ static struct bio *__bio_chain_endio(struct bio *bio)
> > >
> > > static void bio_chain_endio(struct bio *bio)
> > > {
> > > - bio_endio(__bio_chain_endio(bio));
> > > + bio_endio(bio);
> >
> > I don't see how this can work. bio_chain_endio is called literally
> > as the result of calling bio_endio, so you recurse into that.
>
> Hmm, I don't actually see where: bio_endio() only calls
> __bio_chain_endio(), which is fine.
>
> Once bio_chain_endio() only calls bio_endio(), it can probably be
> removed in a follow-up patch.
>
> Also, loosely related, what I find slightly odd is this code in
> __bio_chain_endio():
>
>         if (bio->bi_status && !parent->bi_status)
>                 parent->bi_status = bio->bi_status;
>
> I don't think it really matters whether or not parent->bi_status is
> already set here?
>
> Also, multiple completions can race setting bi_status, so shouldn't we
> at least have a WRITE_ONCE() here and in the other places that set
> bi_status?
>
I'm considering whether we need to add a WRITE_ONCE() in v2 of this
series.

From my understanding, WRITE_ONCE() keeps the compiler from tearing or
merging a store, so the write is performed as a single access. For
instance, it stops the compiler from splitting a 32-bit write into
multiple 8-bit writes that could be interleaved with reads from other
CPUs.

However, bi_status is a single byte (blk_status_t is a u8), so the store
is naturally atomic at the hardware level: a byte-sized write cannot be
torn into smaller pieces.

Therefore, we could potentially change it to:

        if (bio->bi_status && !READ_ONCE(parent->bi_status))
                parent->bi_status = bio->bi_status;

But as you mentioned, the check might not be critical here. So ultimately,
we can simplify it to:

        if (bio->bi_status)
                parent->bi_status = bio->bi_status;

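For illustration only, here is a rough userspace sketch of the simplified
form, with the WRITE_ONCE() marking Andreas suggested applied to the store.
The READ_ONCE()/WRITE_ONCE() macros below are simplified stand-ins (not the
kernel's definitions, and they rely on the GCC/Clang __typeof__ extension),
and struct fake_bio and propagate_status() are hypothetical names that do
not exist in block/bio.c:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins: a volatile access forces exactly one load/store. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

typedef uint8_t blk_status_t;           /* one byte, as in the kernel */

/* Hypothetical miniature of struct bio: only the status field. */
struct fake_bio {
	blk_status_t bi_status;
};

/*
 * Copy a child's error to its parent without first checking whether the
 * parent already has one. Racing completions may overwrite each other,
 * but the parent always ends up holding one of the reported errors.
 */
static void propagate_status(struct fake_bio *parent, struct fake_bio *bio)
{
	blk_status_t status = READ_ONCE(bio->bi_status);

	if (status)
		WRITE_ONCE(parent->bi_status, status);
}
```

A successful child (bi_status == 0) leaves the parent's status untouched, so
an earlier error is never cleared; only a later error can replace it.
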
Thanks,
shida
> Thanks,
> Andreas
>