Message-ID: <20170727151956.o2jtblx3jzcjopfj@hirez.programming.kicks-ass.net>
Date: Thu, 27 Jul 2017 17:19:56 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Nicholas Piggin <npiggin@...il.com>
Cc: linux-kernel@...r.kernel.org, mingo@...nel.org,
jiangshanlai@...il.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
oleg@...hat.com, will.deacon@....com,
Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH tip/core/rcu 4/5] sys_membarrier: Add expedited option

Hi Nick,

See below,

On Thu, Jul 27, 2017 at 03:56:10PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 27, 2017 at 06:08:16AM -0700, Paul E. McKenney wrote:
>
> > > So I think we need either switch_mm() or switch_to() to imply a full
> > > barrier for this to work, otherwise we get:
> > >
> > >   CPU0                            CPU1
> > >
> > >
> > >   lock rq->lock
> > >   mb
> > >
> > >   rq->curr = A
> > >
> > >   unlock rq->lock
> > >
> > >   lock rq->lock
> > >   mb
> > >
> > >                                   sys_membarrier()
> > >
> > >                                     mb
> > >
> > >                                     for_each_online_cpu()
> > >                                       p = A
> > >                                       // no match no IPI
> > >
> > >                                     mb
> > >   rq->curr = B
> > >
> > >   unlock rq->lock
> > >
> > >
> > > And that's bad, because now CPU0 doesn't have an MB happening _after_
> > > sys_membarrier() if B matches.
> >
> > Yes, this looks somewhat similar to the scenario that Mathieu pointed out
> > back in 2010: https://marc.info/?l=linux-kernel&m=126349766324224&w=2
>
> Yes. Minus the mm_cpumask() worries.
>
> > > So without audit, I only know of PPC and Alpha not having a barrier in
> > > either switch_*().
> > >
> > > x86 obviously has barriers all over the place, arm has a super duper
> > > heavy barrier in switch_to().
> >
> > Agreed, if we are going to rely on ->mm, we need ordering on assignment
> > to it.
>
> Right, Boqun provided this reordering to show the problem:
>
>       CPU0                            CPU1
>
>
>       <in process X>
>       lock rq->lock
>       mb
>
>       rq->curr = A
>
>       unlock rq->lock
>
>       <switch to process A>
>
>       lock rq->lock
>       mb
>       read Y(reordered)<---+
>                            |          store to Y
>                            |
>                            |          sys_membarrier()
>                            |
>                            |            mb
>                            |
>                            |            for_each_online_cpu()
>                            |              p = A
>                            |              // no match no IPI
>                            |
>                            |            mb
>                            |
>                            |          store to X
>       rq->curr = B         |
>                            |
>       unlock rq->lock      |
>       <switch to B>        |
>       read X               |
>                            |
>       read Y --------------+

In order to make this work we need either switch_to() or switch_mm() to
provide smp_mb(). Now you've recently taken that out on PPC and I'm
thinking you're not keen to have to put it back in.
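
For readers who want the "no match no IPI" step in concrete terms, here is
a minimal userspace sketch in C. It is a toy model, not kernel code: the
names toy_task, toy_rq, toy_membarrier_scan() and toy_replay_race() are
hypothetical, and the mm is reduced to an integer id. It is single-threaded,
so it only replays the order of events in the diagram above and does not
model the memory reordering itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types: a task is reduced to the id of its mm. */
struct toy_task { int mm_id; };
struct toy_rq { const struct toy_task *curr; };

/*
 * Toy version of the for_each_online_cpu() scan: count the CPUs whose
 * current task shares the caller's mm. Only those CPUs would get an IPI.
 */
static size_t toy_membarrier_scan(const struct toy_rq *rqs, size_t ncpus,
                                  int caller_mm)
{
    size_t ipis = 0;

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        const struct toy_task *p = rqs[cpu].curr;

        if (p && p->mm_id == caller_mm)
            ipis++;
    }
    return ipis;
}

/*
 * Replay the order of events from the diagram: the scan runs while
 * CPU0 still has rq->curr == A, and only afterwards does CPU0 set
 * rq->curr = B. Returns the number of IPIs the scan would have sent.
 */
static size_t toy_replay_race(void)
{
    struct toy_task a = { .mm_id = 1 };   /* A: unrelated mm */
    struct toy_task b = { .mm_id = 2 };   /* B: same mm as the caller */
    struct toy_rq rqs[1] = { { .curr = &a } };

    size_t ipis = toy_membarrier_scan(rqs, 1, 2);  /* sees A, no match */

    rqs[0].curr = &b;                     /* CPU0 switches to B afterwards */
    assert(rqs[0].curr->mm_id == 2);      /* B matches but got no IPI */
    return ipis;
}
```

In this model toy_replay_race() sends no IPI to CPU0 even though B ends up
running there, so the only thing left to order B's reads against the
caller's stores is whatever barrier the context switch path itself provides.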
Mathieu was wondering if placing it in switch_mm() would be less onerous
on performance, thinking that address space changes are more expensive
in any case, seeing how they have a tail of cache and translation
misses. I'm thinking you're not happy either way :-)
Opinions?