[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1392753432.18779.8146.camel@triegel.csb>
Date: Tue, 18 Feb 2014 20:57:12 +0100
From: Torvald Riegel <triegel@...hat.com>
To: paulmck@...ux.vnet.ibm.com
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will.deacon@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ramana Radhakrishnan <Ramana.Radhakrishnan@....com>,
David Howells <dhowells@...hat.com>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"gcc@....gnu.org" <gcc@....gnu.org>
Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
On Tue, 2014-02-18 at 08:55 -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2014 at 04:38:40PM +0100, Torvald Riegel wrote:
> > On Mon, 2014-02-17 at 16:18 -0800, Linus Torvalds wrote:
> > > On Mon, Feb 17, 2014 at 3:41 PM, Torvald Riegel <triegel@...hat.com> wrote:
> > > >
> > > > There's an underlying problem here that's independent from the actual
> > > > instance that you're worried about here: "no sense" is a ultimately a
> > > > matter of taste/objectives/priorities as long as the respective
> > > > specification is logically consistent.
> > >
> > > Yes. But I don't think it's "independent".
> > >
> > > Exactly *because* some people will read standards without applying
> > > "does the resulting code generation actually make sense for the
> > > programmer that wrote the code", the standard has to be pretty clear.
> > >
> > > The standard often *isn't* pretty clear. It wasn't clear enough when
> > > it came to "volatile", and yet that was a *much* simpler concept than
> > > atomic accesses and memory ordering.
> > >
> > > And most of the time it's not a big deal. But because the C standard
> > > generally tries to be very portable, and cover different machines,
> > > there tends to be a mindset that anything inherently unportable is
> > > "undefined" or "implementation defined", and then the compiler writer
> > > is basically given free reign to do anything they want (with
> > > "implementation defined" at least requiring that it is reliably the
> > > same thing).
> >
> > Yes, that's how it works in general. And this makes sense, because all
> > optimizations rely on that. Second, you can't keep something consistent
> > (eg, between compilers) if it isn't specified. So if we want stricter
> > rules, those need to be specified somewhere.
> >
> > > And when it comes to memory ordering, *everything* is basically
> > > non-portable, because different CPU's very much have different rules.
> >
> > Well, the current set of memory orders (and the memory model as a whole)
> > is portable, even though it might not allow to exploit all hardware
> > properties, and thus might perform sub-optimally in some cases.
> >
> > > I worry that that means that the standard then takes the stance that
> > > "well, compiler re-ordering is no worse than CPU re-ordering, so we
> > > let the compiler do anything". And then we have to either add
> > > "volatile" to make sure the compiler doesn't do that, or use an overly
> > > strict memory model at the compiler level that makes it all pointless.
> >
> > Using "volatile" is not a good option, I think, because synchronization
> > between threads should be orthogonal to observable output of the
> > abstract machine.
>
> Are you thinking of "volatile" -instead- of atomics? My belief is that
> given the current standard there will be times that we need to use
> "volatile" -in- -addition- to atomics.
No, I was talking about having to use "volatile" in addition to atomics
to get synchronization properties for the atomics. ISTM that this would
be unfortunate because the objective for "volatile" (ie, declaring what
is output of the abstract machine) is orthogonal to synchronization
between threads. So if we want to preserve control dependencies and get
ordering guarantees through that, I wouldn't prefer it through hacks
based on volatile.
For example, if we have a macro or helper function that contains
synchronization code, but the compiler can prove that the macro/function
is used just on data only accessible to a single thread, then the
"volatile" notation on it would prevent optimizations.
> > The current memory model might not allow to exploit all hardware
> > properties, I agree.
> >
> > But then why don't we work on how to extend it to do so? We need to
> > specify the behavior we want anyway, and this can't be independent of
> > the language semantics, so it has to be conceptually integrated with the
> > standard anyway.
> >
> > > So I really really hope that the standard doesn't give compiler
> > > writers free hands to do anything that they can prove is "equivalent"
> > > in the virtual C machine model.
> >
> > It does, but it also doesn't mean this can't be extended. So let's
> > focus on whether we can find an extension.
> >
> > > That's not how you get reliable
> > > results.
> >
> > In this general form, that's obviously a false claim.
>
> These two sentences starkly illustrate the difference in perspective
> between you two. You are talking past each other. Not sure how to fix
> this at the moment, but what else is new? ;-)
Well, we're still talking, and we are making little steps in clarifying
things, so there is progress :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists