Message-ID: <20170727151956.o2jtblx3jzcjopfj@hirez.programming.kicks-ass.net>
Date:   Thu, 27 Jul 2017 17:19:56 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Nicholas Piggin <npiggin@...il.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
        josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
        dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
        oleg@...hat.com, will.deacon@....com,
        Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH tip/core/rcu 4/5] sys_membarrier: Add expedited option


Hi Nick,

See below,

On Thu, Jul 27, 2017 at 03:56:10PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 27, 2017 at 06:08:16AM -0700, Paul E. McKenney wrote:
> 
> > > So I think we need either switch_mm() or switch_to() to imply a full
> > > barrier for this to work, otherwise we get:
> > > 
> > >   CPU0				CPU1
> > > 
> > > 
> > >   lock rq->lock
> > >   mb
> > > 
> > >   rq->curr = A
> > > 
> > >   unlock rq->lock
> > > 
> > >   lock rq->lock
> > >   mb
> > > 
> > > 				sys_membarrier()
> > > 
> > > 				mb
> > > 
> > > 				for_each_online_cpu()
> > > 				  p = A
> > > 				  // no match no IPI
> > > 
> > > 				mb
> > >   rq->curr = B
> > > 
> > >   unlock rq->lock
> > > 
> > > 
> > > And that's bad, because now CPU0 doesn't have an MB happening _after_
> > > sys_membarrier() if B matches.
> > 
> > Yes, this looks somewhat similar to the scenario that Mathieu pointed out
> > back in 2010: https://marc.info/?l=linux-kernel&m=126349766324224&w=2
> 
> Yes. Minus the mm_cpumask() worries.
> 
> > > So without audit, I only know of PPC and Alpha not having a barrier in
> > > either switch_*().
> > > 
> > > x86 obviously has barriers all over the place, arm has a super duper
> > > heavy barrier in switch_to().
> > 
> > Agreed, if we are going to rely on ->mm, we need ordering on assignment
> > to it.
> 
> Right, Boqun provided this reordering to show the problem:
> 
>   CPU0                                CPU1
>  
>  
>   <in process X>
>   lock rq->lock
>   mb
>  
>   rq->curr = A
>  
>   unlock rq->lock
>  
>   <switch to process A>
>  
>   lock rq->lock
>   mb
>   read Y(reordered)<---+
>                        |        store to Y
>                        |
>                        |        sys_membarrier()
>                        |
>                        |        mb
>                        |
>                        |        for_each_online_cpu()
>                        |          p = A
>                        |          // no match no IPI
>                        |
>                        |        mb
>                        |
>                        |        store to X
>   rq->curr = B         |
>                        |
>   unlock rq->lock      |
>   <switch to B>        |
>   read X               |
>                        |
>   read Y --------------+

In order to make this work we need either switch_to() or switch_mm() to
provide smp_mb(). Now you've recently taken that out on PPC and I'm
thinking you're not keen to have to put it back in.
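
For illustration only, here is a rough userspace model of the two sides
involved. It is a sketch under stated assumptions, not the patch: the
struct and function names (struct mm, struct task, struct rq,
model_context_switch(), model_membarrier_scan()) are hypothetical
stand-ins, C11 seq_cst fences stand in for smp_mb(), and a returned
bitmask stands in for sending IPIs. rq->lock is not modelled.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for the kernel's mm, task and runqueue types. */
struct mm { int id; };
struct task { struct mm *mm; };
struct rq { _Atomic(struct task *) curr; };

static struct rq runqueues[NR_CPUS];

/*
 * Scheduler side: publish the incoming task as rq->curr, then issue a
 * full fence. The fence models the smp_mb() that switch_to() or
 * switch_mm() would have to provide, so that the incoming task's
 * accesses cannot be reordered before the rq->curr store.
 */
static void model_context_switch(int cpu, struct task *next)
{
	atomic_store_explicit(&runqueues[cpu].curr, next,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Caller side: full fence, scan every other CPU's rq->curr, full fence.
 * Returns a bitmask of the CPUs whose current task shares the caller's
 * mm, i.e. the CPUs that would be sent an IPI in this model.
 */
static unsigned int model_membarrier_scan(int self, const struct mm *mm)
{
	unsigned int ipi_mask = 0;

	atomic_thread_fence(memory_order_seq_cst);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct task *p;

		if (cpu == self)
			continue;
		p = atomic_load_explicit(&runqueues[cpu].curr,
					 memory_order_relaxed);
		if (p && p->mm == mm)
			ipi_mask |= 1u << cpu;
	}
	atomic_thread_fence(memory_order_seq_cst);

	return ipi_mask;
}
```

In this model, a CPU that the scan skips because rq->curr did not match
is only safe if the fence in model_context_switch() is present; that is
the ordering the diagram above shows to be missing.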

Mathieu was wondering if placing it in switch_mm() would be less onerous
on performance, thinking that address space changes are more expensive
in any case, seeing how they have a tail of cache and translation
misses. I'm thinking you're not happy either way :-)

Opinions?