lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cd51b422-89f3-1856-5d3b-d6e5b0029085@marcan.st>
Date:   Tue, 16 Aug 2022 14:36:31 +0900
From:   Hector Martin <marcan@...can.st>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Will Deacon <will@...nel.org>
Cc:     Tejun Heo <tj@...nel.org>, peterz@...radead.org,
        jirislaby@...nel.org, maz@...nel.org, mark.rutland@....com,
        boqun.feng@...il.com, catalin.marinas@....com, oneukum@...e.com,
        roman.penyaev@...fitbricks.com, asahi@...ts.linux.dev,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org
Subject: Re: [PATCH] workqueue: Fix memory ordering race in queue_work*()

On 16/08/2022 14.27, Linus Torvalds wrote:
> On Mon, Aug 15, 2022 at 9:15 PM Herbert Xu <herbert@...dor.apana.org.au> wrote:
>>
>> Please revert this as test_and_set_bit was always supposed to be
>> a full memory barrier.  This is an arch bug.
> 
<snip>
> From looking at it, the asm-generic implementation is a bit
> questionable, though. In particular, it does
> 
>         if (READ_ONCE(*p) & mask)
>                 return 1;
> 
> so it's *entirely* unordered for the "bit was already set" case.

These ops are documented in Documentation/atomic_bitops.txt as being
unordered in the failure ("bit was already set" case), and that matches
the generic implementation (which arm64 uses).

On the other hand, Twitter just pointed out that contradicting
documentation exists (I believe this was the source of the third party
doc I found that claimed it's always a barrier):

include/asm-generic/bitops/instrumented-atomic.h

So either one doc and the implementation are wrong, or the other doc is
wrong.

>    --- a/include/asm-generic/bitops/atomic.h
>    +++ b/include/asm-generic/bitops/atomic.h
>    @@ -39,9 +39,6 @@ arch_test_and_set_bit(
>         unsigned long mask = BIT_MASK(nr);
> 
>         p += BIT_WORD(nr);
>    -    if (READ_ONCE(*p) & mask)
>    -            return 1;
>    -
>         old = arch_atomic_long_fetch_or(mask, (atomic_long_t *)p);
>         return !!(old & mask);
>     }
>    @@ -53,9 +50,6 @@ arch_test_and_clear_bit
>         unsigned long mask = BIT_MASK(nr);
> 
>         p += BIT_WORD(nr);
>    -    if (!(READ_ONCE(*p) & mask))
>    -            return 0;
>    -
>         old = arch_atomic_long_fetch_andnot(mask, (atomic_long_t *)p);
>         return !!(old & mask);
>     }
> 
> but the above is not just whitespace-damaged, it's entirely untested
> and based purely on me looking at that code.

That does work, I in fact tried that fix first to prove that the
early-exit was the problem.

I don't have a particularly strong opinion on which way is right here, I
just saw the implementation matched the docs (that I found) and Will
commented on the other thread implying this was all deliberate.

- Hector

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ