Message-ID: <CAHc6FU6B6ip8e-+VXaAiPN+oqJTW2Tuoh0Vv-E96Baf2SSbt7w@mail.gmail.com>
Date: Tue, 2 Dec 2025 22:15:19 +0100
From: Andreas Gruenbacher <agruenba@...hat.com>
To: Christoph Hellwig <hch@....de>
Cc: zhangshida <starzhangzsd@...il.com>, Johannes.Thumshirn@....com, ming.lei@...hat.com, 
	hsiangkao@...ux.alibaba.com, csander@...estorage.com, colyli@...as.com, 
	linux-block@...r.kernel.org, linux-bcache@...r.kernel.org, 
	linux-kernel@...r.kernel.org, zhangshida@...inos.cn
Subject: Re: [PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio

On Tue, Dec 2, 2025 at 6:48 AM Christoph Hellwig <hch@....de> wrote:
> On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote:
> > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@...radead.org> wrote:
> > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > > > > -       if (bio->bi_status && !parent->bi_status)
> > > > > -               parent->bi_status = bio->bi_status;
> > > > > +       if (bio->bi_status)
> > > > > +               cmpxchg(&parent->bi_status, 0, bio->bi_status);
> > > >
> > > > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > > > the chained bios, bi_status is initialized to 0, and it is only set
> > > > again (to a non-0 value) when a failure occurs. When there are
> > > > multiple failures, we only need to make sure that one of those
> > > > failures is eventually reported, but for that, a simple assignment is
> > > > enough here.
> > >
> > > A simple assignment doesn't guarantee atomicity.
> >
> > Well, we've already discussed that bi_status is a single byte and so
> > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be
> > enough here.
>
> No.  At least older alpha can tear byte updates as they need a
> read-modify-write cycle.

I know this used to be a thing in the past, but none of that is
relevant anymore today. Have a look at where [*] quotes the C11
standard:

        memory location
                either an object of scalar type, or a maximal sequence
                of adjacent bit-fields all having nonzero width

                NOTE 1: Two threads of execution can update and access
                separate memory locations without interfering with
                each other.

                NOTE 2: A bit-field and an adjacent non-bit-field member
                are in separate memory locations. The same applies
                to two bit-fields, if one is declared inside a nested
                structure declaration and the other is not, or if the two
                are separated by a zero-length bit-field declaration,
                or if they are separated by a non-bit-field member
                declaration. It is not safe to concurrently update two
                bit-fields in the same structure if all members declared
                between them are also bit-fields, no matter what the
                sizes of those intervening bit-fields happen to be.

[*] Documentation/memory-barriers.txt

> But even on normal x86 the check and the update would be racy.

There is no check and update (RMW), though. Quoting what I wrote
earlier in this thread:

On Mon, Dec 1, 2025 at 11:22 AM Andreas Gruenbacher <agruenba@...hat.com> wrote:
> Hmm. I don't think cmpxchg() actually is of any value here: for all
> the chained bios, bi_status is initialized to 0, and it is only set
> again (to a non-0 value) when a failure occurs. When there are
> multiple failures, we only need to make sure that one of those
> failures is eventually reported, but for that, a simple assignment is
> enough here. The cmpxchg() won't guarantee that a specific error value
> will survive; it all still depends on the timing. The cmpxchg() only
> makes it look like something special is happening here with respect to
> ordering.

So with or without the cmpxchg(), if there are multiple errors, we
won't know which bi_status code will survive, but we do know that we
will end up with one of those error codes.

Andreas