Message-ID: <aSMQyCJrqbIromUd@fedora>
Date: Sun, 23 Nov 2025 21:48:56 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Andreas Gruenbacher <agruenba@...hat.com>
Cc: Stephen Zhang <starzhangzsd@...il.com>, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, nvdimm@...ts.linux.dev,
virtualization@...ts.linux.dev, linux-nvme@...ts.infradead.org,
gfs2@...ts.linux.dev, ntfs3@...ts.linux.dev,
linux-xfs@...r.kernel.org, zhangshida@...inos.cn,
Coly Li <colyli@...as.com>, linux-bcache@...r.kernel.org
Subject: Re: Fix potential data loss and corruption due to Incorrect BIO
Chain Handling
On Sat, Nov 22, 2025 at 03:56:58PM +0100, Andreas Gruenbacher wrote:
> On Sat, Nov 22, 2025 at 1:07 PM Ming Lei <ming.lei@...hat.com> wrote:
> > > static void bio_chain_endio(struct bio *bio)
> > > {
> > > bio_endio(__bio_chain_endio(bio));
> > > }
> >
> > bio_chain_endio() never gets called really, which can be thought as `flag`,
>
> That's probably where this stops being relevant for the problem
> reported by Stephen Zhang.
>
> > and it should have been defined as `WARN_ON_ONCE(1);` for not confusing people.
>
> But shouldn't bio_chain_endio() still be fixed to do the right thing
> if called directly, or alternatively, just BUG()? Warning and still
> doing the wrong thing seems a bit bizarre.

IMO calling ->bi_end_io() directly shouldn't be encouraged.

The only in-tree user that calls it directly could be bcache, so is the
reported issue triggered on bcache?

If bcache can't call bio_endio(), I think it is fine to fix
bio_chain_endio().
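
To illustrate the "flag" point, here is a minimal userspace sketch of the idea, not kernel code. All toy_* names, the struct layout, and the counter are hypothetical and only loosely model bi_end_io, bi_private, bi_status and the remaining count. It assumes the completion path compares the callback pointer against the chain callback and walks to the parent itself, so the chain callback only runs if someone calls it directly.

```c
/* Userspace toy model only; this is NOT the kernel's struct bio or bio_endio(). */
#include <assert.h>
#include <stddef.h>

struct toy_bio;
typedef void (*toy_end_io_t)(struct toy_bio *);

struct toy_bio {
    int remaining;        /* outstanding completions (hypothetical counter) */
    int status;           /* 0 = ok, nonzero = error */
    toy_end_io_t end_io;  /* completion callback */
    void *private_data;   /* for a chained child: pointer to its parent */
    int completions;      /* how many times the final callback ran */
};

static int chain_endio_calls; /* counts direct calls of the marker callback */

static void toy_bio_endio(struct toy_bio *bio);

/* Propagate the first error to the parent and return the parent. */
static struct toy_bio *toy_chain_step(struct toy_bio *bio)
{
    struct toy_bio *parent = bio->private_data;

    if (bio->status && !parent->status)
        parent->status = bio->status;
    return parent;
}

/* Marker callback: toy_bio_endio() compares against it instead of calling it. */
static void toy_chain_endio(struct toy_bio *bio)
{
    chain_endio_calls++;
    toy_bio_endio(toy_chain_step(bio));
}

/* Final callback of the parent in this sketch. */
static void toy_final(struct toy_bio *bio)
{
    bio->completions++;
}

/* Chain child to parent: the parent now waits for one more completion. */
static void toy_chain(struct toy_bio *child, struct toy_bio *parent)
{
    child->end_io = toy_chain_endio;
    child->private_data = parent;
    parent->remaining++;
}

/* Completion path: drop one reference, then walk up the chain iteratively. */
static void toy_bio_endio(struct toy_bio *bio)
{
    while (bio) {
        if (--bio->remaining > 0)
            return;                      /* still waiting for other completions */
        if (bio->end_io == toy_chain_endio) {
            bio = toy_chain_step(bio);   /* used as a flag, never invoked here */
            continue;
        }
        if (bio->end_io)
            bio->end_io(bio);
        return;
    }
}

/* Returns 0 if the chained completion behaves as described above. */
int toy_demo(void)
{
    struct toy_bio parent = { .remaining = 1, .end_io = toy_final };
    struct toy_bio child = { .remaining = 1, .status = 5 };

    toy_chain(&child, &parent);
    toy_bio_endio(&child);               /* parent is still outstanding */
    if (parent.completions != 0)
        return -1;
    toy_bio_endio(&parent);              /* last reference: final callback runs */
    return (parent.completions == 1 && parent.status == 5) ? 0 : -1;
}
```

In this model toy_demo() returns 0 and chain_endio_calls stays at zero as long as every completion goes through toy_bio_endio(). The marker callback only runs when a caller invokes end_io directly, which is the case being discussed for bcache.
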
>
> I also see direct bi_end_io calls in erofs_fileio_ki_complete(),
> erofs_fscache_bio_endio(), and erofs_fscache_submit_bio(), so those
> are at least confusing.

These all look like FS bios (non-chained), so bio_chain_endio() shouldn't be
involved in the erofs code base.

Thanks,
Ming