Message-ID: <01bcf3aa-6072-45e6-9149-c2cd99171454@paulmck-laptop>
Date: Thu, 27 Jul 2023 09:09:15 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Jann Horn <jannh@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...uxfoundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Suren Baghdasaryan <surenb@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Alan Stern <stern@...land.harvard.edu>,
Andrea Parri <parri.andrea@...il.com>,
Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
Nicholas Piggin <npiggin@...il.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Akira Yokosawa <akiyks@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix
anon_vma memory ordering
On Thu, Jul 27, 2023 at 04:07:32PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 27, 2023 at 04:39:34PM +0200, Jann Horn wrote:
> > Assume that we are holding some kind of lock that ensures that the
> > only possible concurrent update to "vma->anon_vma" is that it changes
> > from a NULL pointer to a non-NULL pointer (using smp_store_release()).
> >
> >
> > if (READ_ONCE(vma->anon_vma) != NULL) {
> >         // we now know that vma->anon_vma cannot change anymore
> >
> >         // access the same memory location again with a plain load
> >         struct anon_vma *a = vma->anon_vma;
> >
> >         // this needs to be address-dependency-ordered against one of
> >         // the loads from vma->anon_vma
> >         struct anon_vma *root = a->root;
> > }
> >
> >
> > Is this fine? If it is not fine just because the compiler might
> > reorder the plain load of vma->anon_vma before the READ_ONCE() load,
> > would it be fine after adding a barrier() directly after the
> > READ_ONCE()?
> >
> > I initially suggested using READ_ONCE() for this, and then Linus and
> > I tried to reason it out and Linus suggested (if I understood him
> > correctly) that you could make the ugly argument that this works
> > because loads from the same location will not be reordered by the
> > hardware. So on anything other than alpha, we'd still have the
> > required address-dependency ordering because that happens for all
> > loads, even plain loads, while on alpha, the READ_ONCE() includes a
> > memory barrier. But that argument is weirdly reliant on
> > architecture-specific implementation details.
> >
> > The other option is to replace the READ_ONCE() with a
> > smp_load_acquire(), at which point it becomes a lot simpler to show
> > that the code is correct.
>
> Aren't we straining at gnats here? The context of this is handling a
> page fault, and we used to take an entire rwsem for read. I'm having
> a hard time caring about "the extra expense" of an unnecessarily broad
> barrier.
>
> Cost of an L3 cacheline miss is in the thousands of cycles. Cost of a
> barrier is ... tens?
Couldn't agree more!
Thanx, Paul
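
For illustration only, here is a minimal userspace sketch of the
smp_load_acquire() option mentioned in the quoted text. It uses C11 atomics
as a rough analogue of the kernel primitives (memory_order_release for
smp_store_release(), memory_order_acquire for smp_load_acquire()). The
struct layouts and the helper names publish_anon_vma() and anon_vma_root()
are hypothetical stand-ins, not the kernel's definitions or the code from
the patch series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct anon_vma {
        struct anon_vma *root;
};

struct vma {
        _Atomic(struct anon_vma *) anon_vma;    /* NULL until published */
};

/*
 * Writer side: fully initialise the object, then publish the pointer with
 * release semantics (userspace analogue of smp_store_release()).
 */
static void publish_anon_vma(struct vma *vma, struct anon_vma *a)
{
        assert(a != NULL);
        a->root = a;    /* initialisation that must be visible to readers */
        atomic_store_explicit(&vma->anon_vma, a, memory_order_release);
}

/*
 * Reader side: a single acquire load (userspace analogue of
 * smp_load_acquire()). The result is kept in a local variable, so there is
 * no second plain load of vma->anon_vma whose ordering has to be argued.
 */
static struct anon_vma *anon_vma_root(struct vma *vma)
{
        struct anon_vma *a =
                atomic_load_explicit(&vma->anon_vma, memory_order_acquire);

        if (a == NULL)
                return NULL;    /* not published yet */
        return a->root;         /* ordered after the acquire load */
}
```

In this sketch the acquire load pairs with the release store, so a reader
that observes a non-NULL pointer also observes the initialised root field,
without relying on address-dependency ordering of a later plain load.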