[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whGjpt77tUYRQA6TxiGERp_rSbcNhuKaWbvoJ6yXPwhhQ@mail.gmail.com>
Date: Thu, 27 Jul 2023 10:11:29 -0700
From: Linus Torvalds <torvalds@...uxfoundation.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Will Deacon <will@...nel.org>, Jann Horn <jannh@...gle.com>,
paulmck@...nel.org, Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Andrea Parri <parri.andrea@...il.com>,
Boqun Feng <boqun.feng@...il.com>,
Nicholas Piggin <npiggin@...il.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Akira Yokosawa <akiyks@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix
anon_vma memory ordering
On Thu, 27 Jul 2023 at 08:44, Alan Stern <stern@...land.harvard.edu> wrote:
>
> This reads a little oddly, perhaps because it's a fragment from a larger
> piece of code.
Yes. As Jann already said, this is basically a preparatory step in a
much longer sequence, and the code simply wants to make sure that any
later code (possibly quite a bit later) will not see a NULL value.
I do believe it happens to be safe to use READ_ONCE() for a number of
reasons, but I will argue that we should *never* use a bare READ_ONCE
if there is any actual real question about what any memory ordering
might be.
And yes, the RCU code obviously does use READ_ONCE(), so that's why I
say "bare" - the RCU code wraps it in helper accessors with strict
semantics.
The reason I think it would be techncially *safe* to do here is that
(a) code like this:
if (READ_ONCE(..))
may end up having the conditional speculated by the CPU, and have
the actual read done without any ordering by the CPU, but we do have
*one* guarantee: the memory load instruction will be before the
conditional branch (or whatever) in the instruction stream.
So the code may be *executed* out of order, but we know the
memory load can not be moved after the conditional (whatever form that
conditional then takes) by a compiler.
(We do have various barriers like "rcu_read_unlock()" that
guarantees that the READ_ONCE() couldn't be moved lmuch later even in
the absence of the conditional, but we can ignore that).
(b) the later use of the anon_vma (that depends on the value being
stable) is *so* much later, and behind things that the compiler sees
as barriers (again, that rcu_read_unlock() acts at a minimum as a
compiler barrier) that any subsequent use would not have its load
moved down to before the READ_ONCE() in the instruction stream.
Again, this is purely a "location in the instruction stream"
ordering argument, not a "execution order" ordering argument.
And finally
(c) I do think that all our architectures have the rules that when
you read from the *same* memory location from the same CPU, the
accesses are ordered.
Now, I didn't actually find (c) spelled out anywhere, but even alpha -
the worst memory ordering ever devised - had that particular rule (the
alpha architecture manual calls it "Location Access Order").
Now, with that said, I did argue to Jann that we should use
smp_store_release and smp_load_acquire pairs here, because honestly,
the above (a)-(c) argument is too damn subtle for me, and I don't
think this particular use requires it.
With smp_load_acquire(), you don't need any of the (a)-(c) rules. The
rule is "the load is done before any subsequent memory operations".
End of story.
So while I do think READ_ONCE() is sufficient here, I actually think
that once you start going down that path you can argue that
READ_ONCE() is actually entirely unnecessary, because we also have
other ordering rules that mean that the compiler can't really do
anythinig else even *without* the READ_ONCE().
End result: I can trivially extend the (a)-(c) to say "even
READ_ONCE() isn't strictly necessary here, because even any access
tearing - which won't happen anyway - wouldn't actually change the
result.
So if we want to make it *obvious* that it's safe, we should use
smp_load_acquire().
And if we do find that there are situations where we care so much
about the (generally fairly cheap) possible additional
synchronization, and we really want to use READ_ONCE() rather than
smp_load_acquire(), I'd rather try to wrap a READ_ONCE in some kind of
better abstraction (maybe make the conditional part of the operation,
and make it clear that you are doing a "one-time test which depends on
the same-location rule".
Do we even have the same-location rule in the LKMM?
Linus
Powered by blists - more mailing lists