Message-ID: <20151016161608.GA3816@twins.programming.kicks-ass.net>
Date:	Fri, 16 Oct 2015 18:16:08 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Will Deacon <will.deacon@....com>, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>, Ingo Molnar <mingo@...nel.org>
Subject: Re: Q: schedule() and implied barriers on arm64

On Fri, Oct 16, 2015 at 09:04:22AM -0700, Paul E. McKenney wrote:
> On Fri, Oct 16, 2015 at 05:18:30PM +0200, Peter Zijlstra wrote:
> > Hi,
> > 
> > IIRC Paul relies on schedule() implying a full memory barrier with
> > strong transitivity for RCU.
> > 
> > If not, ignore this email.
> 
> Not so sure about schedule(), but definitely need strong transitivity
> for the rcu_node structure's ->lock field.  And the atomic operations
> on the rcu_dynticks structure's fields when entering or leaving the
> idle loop.
> 
> With schedule, the thread later reports the quiescent state, which
> involves acquiring the rcu_node structure's ->lock field.  So I -think-
> that the locks in the scheduler can be weakly transitive.

So I _thought_ you needed this to separate the preempt-disabled
sections, such that rcu_note_context_switch() is guaranteed to be done
before a new preempt-disabled region starts.

But if you really only need program order guarantees for that, and deal
with everything else from your tick, then that's fine too.

Maybe some previous RCU variant relied on this?
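
For reference, __schedule() at the time looks roughly like this
(paraphrased from kernel/sched/core.c and heavily abridged, so treat it
as a sketch rather than exact source); the point is only the program
order of rcu_note_context_switch() versus the rq->lock section:

static void __sched __schedule(void)
{
	struct task_struct *prev, *next;
	struct rq *rq;
	int cpu;

	preempt_disable();
	cpu = smp_processor_id();
	rq = cpu_rq(cpu);
	rcu_note_context_switch();	/* quiescent state noted here... */
	prev = rq->curr;

	smp_mb__before_spinlock();	/* ...before we take rq->lock */
	raw_spin_lock_irq(&rq->lock);

	/* pick the next task, context_switch(), etc. */

	sched_preempt_enable_no_resched();
}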

> > If so, however, I suspect AArch64 is broken and would need (just like
> > PPC):
> > 
> > #define smp_mb__before_spinlock()	smp_mb()
> > 
> > The problem is that schedule() (when a NO-OP) does:
> > 
> > 	smp_mb__before_spinlock();
> > 	LOCK rq->lock
> > 
> > 	clear_bit()
> > 
> > 	UNLOCK rq->lock
> > 
> > And nothing there implies a full barrier on AArch64, since
> > smp_mb__before_spinlock() defaults to WMB, LOCK is an "ldaxr" or
> > load-acquire, UNLOCK is "stlrh" or store-release, and clear_bit()
> > implies no ordering at all.
> > 
> > Pretty much every other arch has LOCK implying a full barrier, either
> > because it's strongly ordered or because it needs one for the ACQUIRE
> > semantics.
> 
> But I thought that it used a dmb in the spinlock code somewhere or
> another...

arm does, arm64 not so much.
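
For comparison, the two unlock paths look roughly like this (paraphrased
from arch/arm/include/asm/spinlock.h and arch/arm64/include/asm/spinlock.h
of that era; a sketch, not exact source):

/* arm: a full barrier (dmb, via smp_mb()) before the owner-ticket store */
static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
	smp_mb();
	lock->tickets.owner++;
	dsb_sev();
}

/* arm64: only a store-release, no standalone full barrier anywhere */
static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
	asm volatile(
"	stlrh	%w1, %0\n"
	: "=Q" (lock->owner)
	: "r" (lock->owner + 1)
	: "memory");
}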

> Well, arm64 might well need smp_mb__after_unlock_lock() to be non-empty.

Its UNLOCK+LOCK should be RCsc, so that should be good. It's just that
LOCK+UNLOCK doesn't order anything.
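
For reference, the smp_mb__after_unlock_lock() definitions at the time
were roughly the following (again paraphrased, not exact source):

/* include/linux/spinlock.h: empty by default, since UNLOCK+LOCK is
 * assumed to already act as a full barrier on most architectures */
#ifndef smp_mb__after_unlock_lock
#define smp_mb__after_unlock_lock()	do { } while (0)
#endif

/* arch/powerpc/include/asm/spinlock.h: PPC has to upgrade UNLOCK+LOCK
 * to a full (RCsc) barrier by hand */
#define smp_mb__after_unlock_lock()	smp_mb()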