Message-ID: <CA+55aFznCVzoc07p2=eP_5NuTKybMm8-_7+gUMNsAoOP70aYtw@mail.gmail.com>
Date: Fri, 13 Jul 2018 10:16:48 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Will Deacon <will.deacon@....com>
Cc: andrea.parri@...rulasolutions.com,
Daniel Lustig <dlustig@...dia.com>,
Peter Zijlstra <peterz@...radead.org>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Alan Stern <stern@...land.harvard.edu>,
Akira Yokosawa <akiyks@...il.com>,
Boqun Feng <boqun.feng@...il.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Nick Piggin <npiggin@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and
remove it for ordinary release/acquire
On Fri, Jul 13, 2018 at 2:34 AM Will Deacon <will.deacon@....com> wrote:
>
> And, since we're stating preferences, I'll reiterate my preference towards:
>
> * RCsc unlock/lock
> * RCpc release/acquire
Yes, I think this would be best. We *used* to have pretty heavy-weight
locking rules for various reasons, and we relaxed them for reasons
that weren't perhaps always the right ones.
Locking is pretty heavy-weight in general, and meant to be the "I
don't really have to think about this very much" option. Having it then
not be serializing enough, so that it confuses people by allowing odd
behavior (on _some_ architectures), does not sound like a great idea.
In contrast, when you do release/acquire or any of the other "I know
what I'm doing" things, I think we want the minimal serialization
implied by the very specialized op.
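A minimal sketch of the "minimal serialization" case, written with C11
user-space atomics rather than the kernel's own primitives. The names
(struct mailbox, mailbox_publish, mailbox_try_consume) are hypothetical
and only illustrate a release store paired with an acquire load; C11
memory orders are an approximation of the kernel semantics discussed
here, not a definition of them.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox; not a kernel data structure. */
struct mailbox {
	int payload;		/* plain data, published via 'ready' */
	atomic_bool ready;	/* flag carrying the release/acquire pair */
};

static void mailbox_init(struct mailbox *m)
{
	m->payload = 0;
	atomic_init(&m->ready, false);
}

/* Writer: the release store orders the payload write before the flag. */
static void mailbox_publish(struct mailbox *m, int value)
{
	m->payload = value;
	atomic_store_explicit(&m->ready, true, memory_order_release);
}

/*
 * Reader: if the acquire load observes the flag, the payload written
 * before the matching release store is guaranteed to be visible.
 * Nothing stronger than that pairing is asked for.
 */
static bool mailbox_try_consume(struct mailbox *m, int *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_acquire))
		return false;
	*out = m->payload;
	return true;
}
```

The point of the sketch is that the only ordering requested is between
the payload and the flag; no global serialization is implied.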
> * Not fussed about atomic rmws, but having them closer to RCsc would
> make it easier to implement and reason about generic locking
> implementations
I would prefer that rmw's be RCsc by default, but that there are then
"relaxed" versions of it that aren't.
For example, one common case of rmw has nothing to do with any
ordering at all: statistics gathering. It usually has absolutely zero
need for any ordering per se, and all it wants is cache coherence.
Yes, yes, the really critical stuff we then use percpu counters for
and a lot of clever software, but there are a lot of cases where that
isn't practical or isn't _quite_ important enough.
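A rough sketch of such a statistics counter, again using C11 user-space
atomics as a stand-in and not the kernel's atomic_t API. The names
(struct stat_counter, stat_add, stat_add_relaxed) are hypothetical.
memory_order_seq_cst is used here only as an approximation of a fully
ordered rmw; memory_order_relaxed keeps the update atomic and coherent
but requests no ordering against other accesses.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics counter; names are illustrative only. */
struct stat_counter {
	atomic_uint_fast64_t events;
};

static inline void stat_init(struct stat_counter *c)
{
	atomic_init(&c->events, 0);
}

/* Ordered-by-default flavor: atomic add with the strongest C11 ordering. */
static inline void stat_add(struct stat_counter *c, uint64_t n)
{
	atomic_fetch_add_explicit(&c->events, n, memory_order_seq_cst);
}

/* Relaxed flavor: atomicity and cache coherence, no ordering implied. */
static inline void stat_add_relaxed(struct stat_counter *c, uint64_t n)
{
	atomic_fetch_add_explicit(&c->events, n, memory_order_relaxed);
}

/* Reading a statistic likewise needs no ordering, only a coherent value. */
static inline uint64_t stat_read(struct stat_counter *c)
{
	return atomic_load_explicit(&c->events, memory_order_relaxed);
}
```

Both flavors never lose an increment; they differ only in what they
promise about the ordering of surrounding memory accesses, which a pure
statistics counter does not care about.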
So "atomic_add()" being RCsc sounds like a nice tight requirement, but
then architectures that can do it cheaper could have
"atomic_add_relaxed()" that has no inherent ordering at all.
But let's see what the powerpc people find about the actual
performance impact of being RCsc on locking. Real numbers for real
loads would be nice.
Linus