Message-ID: <20160523122554.GH15728@worktop.ger.corp.intel.com>
Date:	Mon, 23 May 2016 14:25:54 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Boqun Feng <boqun.feng@...il.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Manfred Spraul <manfred@...orfullife.com>,
	Waiman Long <Waiman.Long@....com>,
	Ingo Molnar <mingo@...nel.org>, ggherdovich@...e.com,
	Mel Gorman <mgorman@...hsingularity.net>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Will Deacon <will.deacon@....com>
Subject: Re: sem_lock() vs qspinlocks

On Fri, May 20, 2016 at 10:00:45AM -0700, Linus Torvalds wrote:

> So I do wonder if we should make that smp_mb() be something the
> *caller* has to do, and document rules for it. IOW, introduce a new
> spinlock primitive called "spin_lock_synchronize()", and then spinlock
> implementations that have this non-atomic behavior with an unordered
> store would do something like
> 
>     static inline void queued_spin_lock_synchronize(struct qspinlock *a,
>                                                      struct qspinlock *b)
>     {
>         smp_mb();
>     }
> 
> and then we'd document that *if* you need ordering guarantees between
> 
>    spin_lock(a);
>    .. spin_is_locked/spin_wait_lock(b) ..
> 
> you have to have a
> 
>     spin_lock_synchronize(a, b);
> 
> in between.

So I think I favour the explicit barrier. But my 'problem' is that we
now have _two_ different scenarios in which we need to order two
different spinlocks.

The first is the RCpc vs RCsc spinlock situation (currently only on
PowerPC), where the spin_unlock() + spin_lock() 'barrier' is not
transitive.
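
A minimal sketch of that first case, using the existing upgrade
primitive (Paul's smp_mb__after_unlock_lock(), more on that below);
the lock names are purely illustrative:

	spin_unlock(&this_lock);
	spin_lock(&that_lock);
	smp_mb__after_unlock_lock();	/* upgrade unlock+lock to a full barrier */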

And the second is this 'new' situation, where the store is unordered
and is not observable until a release. That is fundamentally so on PPC
and ARM64, but it can also arise from lock implementation choices, as
with our qspinlock, which makes it manifest even on x86.
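
To illustrate that second case, a rough sketch modelled on the
sem_lock() pattern discussed in this thread; the function and variable
names are made up:

	static inline bool fastpath_ok(spinlock_t *local, spinlock_t *global)
	{
		spin_lock(local);
		/*
		 * Without a full barrier here, the store that marks
		 * 'local' as held need not be visible yet when we
		 * inspect 'global', so a CPU holding 'global' can miss
		 * us entirely and both sides take their fast path.
		 */
		smp_mb();
		return !spin_is_locked(global);
	}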

Now, ideally we'd be able to use one barrier construct for both; but
while there is overlap, they're not the same. And I'd be somewhat
reluctant to issue superfluous smp_mb()s just because; it is an
expensive instruction.

Paul has smp_mb__after_unlock_lock() for the RCpc 'upgrade'. How about
something like:

	smp_mb__after_lock()

?
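
Something like the below, perhaps; note the name only exists as a
proposal in this mail and the real definition would need
per-architecture care:

	/*
	 * Full barrier where the lock-acquire store can be unordered
	 * (qspinlock, PPC, ARM64); architectures whose spin_lock()
	 * already orders sufficiently could make this a no-op.
	 */
	#ifndef smp_mb__after_lock
	#define smp_mb__after_lock()	smp_mb()
	#endif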


OTOH; even if we document this, it is something that is easy to forget
or miss. It is not like Documentation/memory-barriers.txt is in want of
more complexity.
