Message-ID: <20170731233731.32e68f6d@roar.ozlabs.ibm.com>
Date:   Tue, 1 Aug 2017 10:35:33 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Michael Ellerman <mpe@...erman.id.au>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>,
        Andrew Hunter <ahh@...gle.com>,
        Maged Michael <maged.michael@...il.com>, gromer@...gle.com,
        Avi Kivity <avi@...lladb.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Palmer Dabbelt <palmer@...belt.com>
Subject: Re: [RFC PATCH v2] membarrier: expedited private command

On Mon, 31 Jul 2017 23:20:59 +1000
Michael Ellerman <mpe@...erman.id.au> wrote:

> Peter Zijlstra <peterz@...radead.org> writes:
> 
> > On Fri, Jul 28, 2017 at 10:55:32AM +0200, Peter Zijlstra wrote:  
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index e9785f7aed75..33f34a201255 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -2641,8 +2641,18 @@ static struct rq *finish_task_switch(struct task_struct *prev)
> >>  	finish_arch_post_lock_switch();
> >>  
> >>  	fire_sched_in_preempt_notifiers(current);
> >> +
> >> +	/*
> >> +	 * For CONFIG_MEMBARRIER we need a full memory barrier after the
> >> +	 * rq->curr assignment. Not all architectures have one in either
> >> +	 * switch_to() or switch_mm() so we use (and complement) the one
> >> +	 * implied by mmdrop()'s atomic_dec_and_test().
> >> +	 */
> >>  	if (mm)
> >>  		mmdrop(mm);
> >> +	else if (IS_ENABLED(CONFIG_MEMBARRIER))
> >> +		smp_mb();
> >> +
> >>  	if (unlikely(prev_state == TASK_DEAD)) {
> >>  		if (prev->sched_class->task_dead)
> >>  			prev->sched_class->task_dead(prev);
> >> 
> >>   
> >  
> >> a whole bunch of architectures don't in fact need this extra barrier at all.  
> >
> > In fact, I'm fairly sure it's only PPC.
> >
> > Because only ARM64 and PPC actually implement ACQUIRE/RELEASE with
> > anything other than smp_mb() (for now, RISC-V is in this same boat and
> > MIPS could be if they ever sort out their fancy barriers).
> >
> > TSO archs use a regular STORE for RELEASE, but all their atomics imply a
> > smp_mb() and there are enough around to make one happen (typically
> > mm_cpumask updates).
> >
> > Everybody else, aside from ARM64 and PPC, must use smp_mb() for
> > ACQUIRE/RELEASE.
> >
> > ARM64 has a super duper barrier in switch_to().
> >
> > Which only leaves PPC stranded.. but the 'good' news is that mpe says
> > they'll probably need a barrier in switch_mm() in any case.  
> 
> I may have been sleep deprived. We have a patch, probably soon to be
> merged, which will add a smp_mb() in switch_mm() but *only* when we add
> a CPU to mm_cpumask, ie. when we run on a CPU we haven't run on before.
> 
> I'm not across membarrier enough to know if that's sufficient, but it
> seems unlikely?

That won't be sufficient: membarrier needs a full barrier after the
rq->curr assignment, not only when a CPU is added to mm_cpumask.
The barrier can be avoided only when switching between threads that
share the same mm.
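
To make that condition concrete, here is a rough userspace model of it.
This is only a sketch, not kernel code: struct mm, struct task and
needs_mb_after_curr() are hypothetical stand-ins, and a kernel thread is
simply modelled as a task whose mm is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm_struct and task_struct. */
struct mm {
	int id;
};

struct task {
	struct mm *mm;	/* NULL models a kernel thread (assumption) */
};

/*
 * Model of the rule above: a full barrier after the rq->curr assignment
 * is required unless prev and next share the same mm.
 */
static bool needs_mb_after_curr(const struct task *prev,
				const struct task *next)
{
	assert(prev != NULL && next != NULL);

	return prev->mm != next->mm;
}
```

In this model a switch between two threads of one process returns false,
while a switch to a thread of a different process returns true.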

I would like to see how bad membarrier performance is if we made
the membarrier() side heavy enough to avoid the barrier in context
switch (e.g., by taking the rq locks, or using
synchronize_sched_expedited() -- on a per-arch basis of course).

Is there some (realistic-ish) benchmark using membarrier we can
experiment with?
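
Failing that, a naive starting point would be to time the syscall itself.
The sketch below is only that: it measures the caller side and says
nothing about the context switch cost on other CPUs. sys_membarrier() and
membarrier_ns_per_call() are hypothetical helper names; the syscall number
and MEMBARRIER_CMD_* constants are assumed to come from the installed
kernel headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* No libc wrapper is assumed here, so go through syscall(2). */
static int sys_membarrier(int cmd, int flags)
{
	return (int)syscall(SYS_membarrier, cmd, flags);
}

/*
 * Average wall-clock cost in nanoseconds of one membarrier(cmd, 0) call
 * over iters calls. Returns -1.0 with errno set if the syscall fails, or
 * with errno = EINVAL if MEMBARRIER_CMD_QUERY does not advertise cmd.
 */
static double membarrier_ns_per_call(int cmd, long iters)
{
	struct timespec t0, t1;
	int supported;
	long i;

	assert(iters > 0);

	/* The query command returns a bitmask of supported commands. */
	supported = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);
	if (supported < 0)
		return -1.0;
	if (!(supported & cmd)) {
		errno = EINVAL;
		return -1.0;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		if (sys_membarrier(cmd, 0) < 0)
			return -1.0;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / iters;
}
```

Calling membarrier_ns_per_call(MEMBARRIER_CMD_SHARED, 100) gives a
baseline for the existing command; an expedited command could be passed
the same way once the kernel advertises it.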

Thanks,
Nick
