[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1391729117.23421.4066.camel@triegel.csb>
Date: Fri, 07 Feb 2014 00:25:17 +0100
From: Torvald Riegel <triegel@...hat.com>
To: "Joseph S. Myers" <joseph@...esourcery.com>
Cc: Will Deacon <will.deacon@....com>,
Ramana Radhakrishnan <Ramana.Radhakrishnan@....com>,
David Howells <dhowells@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
"gcc@....gnu.org" <gcc@....gnu.org>
Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
On Thu, 2014-02-06 at 22:13 +0000, Joseph S. Myers wrote:
> On Thu, 6 Feb 2014, Torvald Riegel wrote:
>
> > > It seems that nobody really
> > > agrees on exactly how the C11 atomics map to real architectural
> > > instructions on anything but the trivial architectures.
> >
> > There's certainly different ways to implement the memory model and those
> > have to be specified elsewhere, but I don't see how this differs much
> > from other things specified in the ABI(s) for each architecture.
>
> It is not clear to me that there is any good consensus understanding of
> how to specify this in an ABI, or how, given an ABI, to determine whether
> an implementation is valid.
>
> For ABIs not considering atomics / concurrency, it's well understood, for
> example, that the ABI specifies observable conditions at function call
> boundaries ("these arguments are in these registers", "the stack pointer
> has this alignment", "on return from the function, the values of these
> registers are unchanged from the values they had on entry"). It may
> sometimes specify things at other times (e.g. presence or absence of a red
> zone - whether memory beyond the stack pointer may be overwritten on an
> interrupt). But if it gives a code sequence, it's clear this is just an
> example rather than a requirement for particular code to be generated -
> any code sequence suffices if it meets the observable conditions at the
> points where code generated by one implementation may pass control to code
> generated by another implementation.
>
> When atomics are involved, you no longer have a limited set of
> well-defined points where control may pass from code generated by one
> implementation to code generated by another - the code generated by the
> two may be running concurrently.
Agreed.
> We know of certain cases
> <http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html> where there are
> choices of the mapping of atomic operations to particular instructions.
> But I'm not sure there's much evidence that these are the only ABI issues
> arising from concurrency - that there aren't any other ways in which an
> implementation may transform code, consistent with the as-if rule of ISO
> C, that may run into incompatibilities of different choices.
I can't think of other incompatibilities with high likelihood --
provided we ignore consume memory order and the handling of dependencies
(see below). But I would doubt there is a high risk of such because (1)
the data race definition should hopefully not cause subtle
incompatibilities and (2) there is clear "hand-off point" from the
compiler to a specific instruction representing an atomic access. For
example, if we have a release store, and we agree on the instructions
used for that, then compilers will have to ensure happens-before for
anything before the release store; for example, as long as stores
sequenced-before the release store are performed, it doesn't matter in
which order that happens. Subsequently, an acquire-load somewhere else
can pick this sequence of events up just by using the agreed-upon
acquire-load; like with the stores, it can order subsequent loads in any
way it sees fit, including different optimizations. That's obviously
not a formal proof, though. But it seems likely to me, at least :)
I'm more concerned about consume and dependencies because as far as I
understand the standard's requirements, dependencies need to be tracked
across function calls. Thus, we might have several compilers involved
in that, and we can't just "condense" things to happens-before, but it's
instead how and that we keep dependencies intact. Because of that, I'm
wondering whether this is actually implementable practically. (See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448#c10)
> And even if
> those are the only issues, it's not clear there are well-understood ways
> to define the mapping from the C11 memory model to the architecture's
> model, which provide a good way to reason about whether a particular
> choice of instructions is valid according to the mapping.
I think that if we have different options, there needs to be agreement
on which to choose across the compilers, at the very least. I don't
quite know how this looks like for GCC and LLVM, for example.
> > Are you familiar with the formalization of the C11/C++11 model by Batty
> > et al.?
> > http://www.cl.cam.ac.uk/~mjb220/popl085ap-sewell.pdf
> > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
>
> These discuss, as well as the model itself, proving the validity of a
> particular choice of x86 instructions. I imagine that might be a starting
> point towards an understanding of how to describe the relevant issues in
> an ABI, and how to determine whether a choice of instructions is
> consistent with what an ABI says. But I don't get the impression this is
> yet at the level where people not doing research in the area can routinely
> read and write such ABIs and work out whether a code sequence is
> consistent with them.
It's certainly complicated stuff (IMHO). On the positive side, it's
just a few compilers, the kernel, etc. that have to deal with this, if
most programmers use the languages' memory model.
> (If an ABI says "use instruction X", then you can't use a more efficient
> X' added by a new version of the instruction set. But it can't
> necessarily be as loose as saying "use X and Y, or other instructions that
> achieve semantics when the other thread is using X or Y", because it might
> be the case that Y' interoperates with X, X' interoperates with Y, but X'
> and Y' don't interoperate with each other. I'd envisage something more
> like mapping not to instructions, but to concepts within the
> architecture's own memory model - but that requires the architecture's
> memory model to be as well defined, and probably formalized, as the C11
> one.)
Yes. The same group of researchers have also worked on formalizing the
Power model, and use this as base for a proof of the correctness of the
proposed mappings: http://www.cl.cam.ac.uk/~pes20/cppppc/
The formal approach to all this might be a complex task, but it is more
confidence-inspiring than making guesses about one standard's prose vs.
another specification's prose.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists