linux-kernel - Re: [RFC][PATCH] mips: Fix arch_spin


Message-ID: <CA+55aFxBc8Phdbo44dLoAoF=eWXDqrgQzXjpj-_s_SK+aWAGag@mail.gmail.com>
Date:	Tue, 2 Feb 2016 09:30:26 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Boqun Feng <boqun.feng@...il.com>
Cc:	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Will Deacon <will.deacon@....com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Maciej W. Rozycki" <macro@...tec.com>,
	David Daney <ddaney@...iumnetworks.com>,
	Måns Rullgård <mans@...sr.com>,
	Ralf Baechle <ralf@...ux-mips.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()

On Tue, Feb 2, 2016 at 1:34 AM, Boqun Feng <boqun.feng@...il.com> wrote:
>
> Just to be clear, what Will, Paul and I are discussing here is about
> local transitivity,

I really don't think that changes the picture.

Given that

 (a) we already mix ordering methods and there are good reasons for
it, and I'd expect transitivity only makes that more likely

 (b) we expect transitivity from the individual ordering methods

 (c) I don't think that there are any relevant CPUs that violate this anyway

I really think that not expecting that to hold for mixed accesses
would be a complete disaster. It will confuse the hell out of people.

And the basic argument really stands: we should make the memory
ordering expectations as strong as we can, given the existing relevant
architecture constraints (ie x86/arm/power).

If that then means that some other architecture might need to add
extra serialization that that architecture doesn't _want_ to add,
tough luck. I absolutely hate the fact that alpha forced us to add
that crazy read-depends barrier, and I want to discourage that a lot.

In fact, I'd be willing to strengthen our existing orderings just in
the name of sanity, and say that "rcu_dereference()" should just be an
acquire, and say that if the architecture makes that more expensive,
then who the hell cares? I have not been very happy with the "consume"
memory ordering discussions for C++. Yes, it would hurt pre-lwsync
power a bit, and it would hurt 32-bit arm, but enough that we should
have the headache of the existing semantics?

So I think our current memory orderings are potentially too _weak_,
and we sure as hell shouldn't strive to weaken them further.

I think doing "smp_read_barrier_depends()" was a mistake, but it was a
mistake driven by the fact that our memory ordering _used_ to be
barrier-centric. If alpha were to have happened today, I would say
that rather than have smp_read_barrier_depends(), we should just say
that anything that requires it should use smp_load_acquire() (or
rcu_dereference - which I think could have the same semantics), and be
done with it.
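
As a rough illustration of that model, here is a minimal userspace sketch
using C11 atomics rather than the kernel primitives. It is not kernel code:
the names (struct config, publish_config, read_config) are hypothetical, and
memory_order_release / memory_order_acquire only stand in for what
smp_store_release() and smp_load_acquire() provide in the kernel. The reader
uses an acquire load of the pointer, so the later dereference is ordered
without any separate dependency barrier.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload, used only for this illustration. */
struct config {
    int value;
};

static struct config slot;                   /* backing storage for the object */
static _Atomic(struct config *) shared_cfg;  /* NULL until something is published */

/* Writer side: initialise the object first, then publish the pointer.
 * The release store orders the write to slot.value before the pointer
 * becomes visible to readers. */
void publish_config(int value)
{
    slot.value = value;
    atomic_store_explicit(&shared_cfg, &slot, memory_order_release);
}

/* Reader side: a single acquire load replaces "plain load followed by a
 * read-depends barrier". Returns -1 if nothing has been published yet. */
int read_config(void)
{
    struct config *p =
        atomic_load_explicit(&shared_cfg, memory_order_acquire);

    if (p == NULL)
        return -1;
    return p->value;  /* ordered after the pointer load by the acquire */
}
```

In real use the writer and reader would run on different CPUs. On strongly
ordered hardware the acquire load typically costs nothing extra, while on
weaker architectures it may add a barrier, which is the trade-off being
argued for here.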

And if I would make that choice today, why isn't the right thing to
just get rid of that weak nasty thing, and convert the existing (few -
there really aren't that many) smp_read_barrier_depends() users away
from that model?

See what I'm saying? We've been pandering to weak memory ordering
before. We may have had our reasons to do so, but I think it's a
mistake. It results in code that is hard to think about.

So the fact that people worry about "rcu_dereference()" really makes
me think that one is just too weak, and we should just bite the bullet
and say it's an acquire. I'd much rather take a few barriers than make
our programming model hard to understand.

                    Linus