linux-kernel - Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix anon

Message-Id: <A31BD1BD-FB53-4E5C-B8B7-44817D2BC322@joelfernandes.org>
Date:   Fri, 28 Jul 2023 15:50:23 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     paulmck@...nel.org
Cc:     Alan Stern <stern@...land.harvard.edu>,
        Will Deacon <will@...nel.org>, Jann Horn <jannh@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...uxfoundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Andrea Parri <parri.andrea@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Akira Yokosawa <akiyks@...il.com>,
        Daniel Lustig <dlustig@...dia.com>
Subject: Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix anon_vma memory ordering



> On Jul 28, 2023, at 2:18 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
> 
> On Fri, Jul 28, 2023 at 02:03:09PM -0400, Joel Fernandes wrote:
>>> On Fri, Jul 28, 2023 at 1:51 PM Alan Stern <stern@...land.harvard.edu> wrote:
>>> 
>>> On Fri, Jul 28, 2023 at 01:35:43PM -0400, Joel Fernandes wrote:
>>>> On Fri, Jul 28, 2023 at 8:44 AM Will Deacon <will@...nel.org> wrote:
>>>>> 
>>>>> On Thu, Jul 27, 2023 at 12:34:44PM -0400, Joel Fernandes wrote:
>>>>>>> On Jul 27, 2023, at 10:57 AM, Will Deacon <will@...nel.org> wrote:
>>>>>>> On Thu, Jul 27, 2023 at 04:39:34PM +0200, Jann Horn wrote:
>>>>>>>> if (READ_ONCE(vma->anon_vma) != NULL) {
>>>>>>>> // we now know that vma->anon_vma cannot change anymore
>>>>>>>> 
>>>>>>>> // access the same memory location again with a plain load
>>>>>>>> struct anon_vma *a = vma->anon_vma;
>>>>>>>> 
>>>>>>>> // this needs to be address-dependency-ordered against one of
>>>>>>>> // the loads from vma->anon_vma
>>>>>>>> struct anon_vma *root = a->root;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Is this fine? If it is not fine just because the compiler might
>>>>>>>> reorder the plain load of vma->anon_vma before the READ_ONCE() load,
>>>>>>>> would it be fine after adding a barrier() directly after the
>>>>>>>> READ_ONCE()?
>>>>>>> 
>>>>>>> I'm _very_ wary of mixing READ_ONCE() and plain loads to the same variable,
>>>>>>> as I've run into cases where you have sequences such as:
>>>>>>> 
>>>>>>>   // Assume *ptr is initially 0 and somebody else writes it to 1
>>>>>>>   // concurrently
>>>>>>> 
>>>>>>>   foo = *ptr;
>>>>>>>   bar = READ_ONCE(*ptr);
>>>>>>>   baz = *ptr;
>>>>>>> 
>>>>>>> and you can get foo == baz == 0 but bar == 1 because the compiler only
>>>>>>> ends up reading from memory twice.
>>>>>>> 
>>>>>>> That was the root cause behind f069faba6887 ("arm64: mm: Use READ_ONCE
>>>>>>> when dereferencing pointer to pte table"), which was very unpleasant to
>>>>>>> debug.
>>>>>> 
>>>>>> Will, unless I am missing something fundamental, this case is different though.
>>>>>> This case does not care about fewer reads. As long as the first read is volatile, the subsequent loads (even plain)
>>>>>> should work fine, no?
>>>>>> I am not seeing how the compiler can screw that up, so please do enlighten :).
>>>>> 
>>>>> I guess the thing I'm worried about is if there is some previous read of
>>>>> 'vma->anon_vma' which didn't use READ_ONCE() and the compiler kept the
>>>>> result around in a register. In that case, 'a' could be NULL, even if
>>>>> the READ_ONCE(vma->anon_vma) returned non-NULL.
>>>> 
>>>> If I can be brave enough to say -- that appears to be a compiler
>>>> bug to me. It seems that the compiler in such an instance violates the
>>>> "Sequential Consistency Per Variable" rule? I mean if it can't even
>>>> keep SCPV true for a same memory-location load (plain or not) for a
>>>> sequence of code, how can it expect the hardware to?
>>> 
>>> It's not a compiler bug.  In this example, some other thread performs a
>>> write that changes vma->anon_vma from NULL to non-NULL.  This write
>>> races with the plain reads, and compilers are not required to obey the
>>> "Sequential Consistency Per Variable" rule (or indeed, any rule) when
>>> there is a data race.
>> 
>> So you're saying the following code behavior is OK?
>> 
>> /* Say anon_vma can only ever transition from NULL to non-NULL values */
>> a = vma->anon_vma;  // Reads NULL
>> b = READ_ONCE(vma->anon_vma); // Reads non-NULL
>> c = vma->anon_vma;  // Reads NULL!!!
>> if (b) {
>>  c->some_attribute++; // Oopsie
>> }
> 
> Is there some way to obtain (a && !b) that does not involve a data race,
> and thus carte blanche for the compiler to do whatever it pleases?
> I am not seeing one.
> 
> What am I missing?

Probably nothing. I think I was living briefly in a fantasy world where I
expected predictable compiler behavior on same-memory accesses
amidst data races. It is good to come back to reality.

thanks,

 - Joel

> 
>                            Thanx, Paul
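
For readers following the thread, here is a minimal user-space sketch of the
pattern the discussion points toward: load the racy pointer once through a
volatile access, then use only that local snapshot, never a second plain load
of the same location. The names READ_ONCE_SKETCH, anon_vma_sketch, vma_sketch
and anon_root_sketch are hypothetical stand-ins and not kernel code. The macro
only imitates the classic volatile-cast form and relies on the GNU C
__typeof__ extension. The sketch shows the single-load idea only; it makes no
claim about the memory-ordering guarantees the kernel's READ_ONCE() provides,
nor about what the patch series actually does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): one volatile load.
 * Assumes GCC or Clang (uses the __typeof__ extension). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Simplified, hypothetical versions of the structures in the thread. */
struct anon_vma_sketch {
	struct anon_vma_sketch *root;
};

struct vma_sketch {
	struct anon_vma_sketch *anon_vma;
};

/*
 * Return the root of vma->anon_vma, or NULL if none is attached yet.
 * The pointer is loaded exactly once; both the NULL check and the
 * dereference use the local copy 'a', so the compiler has no second
 * plain load that it could satisfy from a stale register value.
 */
static struct anon_vma_sketch *anon_root_sketch(struct vma_sketch *vma)
{
	struct anon_vma_sketch *a = READ_ONCE_SKETCH(vma->anon_vma);

	if (a == NULL)
		return NULL;

	return a->root;
}
```

Compared with the snippets quoted above, the only change is that the value
that was checked is also the value that is dereferenced, so the
"b is non-NULL but c is NULL" outcome cannot arise within this function.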