[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFz-gaNKxRGU-YPw3xu=T8ZxFv0v22i+R7vWoeQT0i+8Mw@mail.gmail.com>
Date: Mon, 17 Feb 2014 14:32:46 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Torvald Riegel <triegel@...hat.com>
Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Will Deacon <will.deacon@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ramana Radhakrishnan <Ramana.Radhakrishnan@....com>,
David Howells <dhowells@...hat.com>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"gcc@....gnu.org" <gcc@....gnu.org>
Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
On Mon, Feb 17, 2014 at 2:09 PM, Torvald Riegel <triegel@...hat.com> wrote:
> On Sat, 2014-02-15 at 11:15 -0800, Linus Torvalds wrote:
>> >
>> > if (atomic_load(&x, mo_relaxed) == 1)
>> > atomic_store(&y, 3, mo_relaxed));
>>
>> No, please don't use this idiotic example. It is wrong.
>
> It won't be useful in practice in a lot of cases, but that doesn't mean
> it's wrong. It's clearly not illegal code. It also serves a purpose: a
> simple example to reason about a few aspects of the memory model.
It's not illegal code, but i you claim that you can make that store
unconditional, it's a pointless and wrong example.
>> The fact is, if a compiler generates anything but the obvious sequence
>> (read/cmp/branch/store - where branch/store might obviously be done
>> with some other machine conditional like a predicate), the compiler is
>> wrong.
>
> Why? I've reasoned why (1) to (3) above allow in certain cases (i.e.,
> the first load always returning 1) for the branch (or other machine
> conditional) to not be emitted. So please either poke holes into this
> reasoning, or clarify that you don't in fact, contrary to what you wrote
> above, agree with (1) to (3).
The thing is, the first load DOES NOT RETURN 1. It returns whatever
that memory location contains. End of story.
Stop claiming it "can return 1".. It *never* returns 1 unless you do
the load and *verify* it, or unless the load itself can be made to go
away. And with the code sequence given, that just doesn't happen. END
OF STORY.
So your argument is *shit*. Why do you continue to argue it?
I told you how that load can go away, and you agreed. But IT CANNOT GO
AWAY any other way. You cannot claim "the compiler knows". The
compiler doesn't know. It's that simple.
>> So why do I say you are wrong, after I just gave you an example of how
>> it happens? Because my example went back to the *real* issue, and
>> there are actual real semantically meaningful details with doing
>> things like load merging.
>>
>> To give an example, let's rewrite things a bit more to use an extra variable:
>>
>> atomic_store(&x, 1, mo_relaxed);
>> a = atomic_load(&1, mo_relaxed);
>> if (a == 1)
>> atomic_store(&y, 3, mo_relaxed);
>>
>> which looks exactly the same.
>
> I'm confused. Is this a new example?
That is a new example. The important part is that it has left a
"trace" for the programmer: because 'a' contains the value, the
programmer can now look at the value later and say "oh, we know we did
a store iff a was 1"
>> This sequence:
>>
>> atomic_store(&x, 1, mo_relaxed);
>> a = atomic_load(&x, mo_relaxed);
>> atomic_store(&y, 3, mo_relaxed);
>>
>> is actually - and very seriously - buggy.
>>
>> Why? Because you have effectively split the atomic_load into two loads
>> - one for the value of 'a', and one for your 'proof' that the store is
>> unconditional.
>
> I can't follow that, because it isn't clear to me which code sequences
> are meant to belong together, and which transformations the compiler is
> supposed to make. If you would clarify that, then I can reply to this
> part.
Basically, if the compiler allows the condition of "I wrote 3 to the
y, but the programmer sees 'a' has another value than 1 later" then
the compiler is one buggy pile of shit. It fundamentally broke the
whole concept of atomic accesses. Basically the "atomic" access to 'x'
turned into two different accesses: the one that "proved" that x had
the value 1 (and caused the value 3 to be written), and the other load
that then write that other value into 'a'.
It's really not that complicated.
And this is why descriptions like this should ABSOLUTELY NOT BE
WRITTEN as "if the compiler can prove that 'x' had the value 1, it can
remove the branch". Because that IS NOT SUFFICIENT. That was not a
valid transformation of the atomic load.
The only valid transformation was the one I stated, namely to remove
the load entirely and replace it with the value written earlier in the
same execution context.
Really, why is so hard to understand?
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists