[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb6050e9-91a8-f04c-809c-7a22aed232a8@gmail.com>
Date: Thu, 20 Jul 2017 21:55:31 +0900
From: Akira Yokosawa <akiyks@...il.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Boqun Feng <boqun.feng@...il.com>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, Akira Yokosawa <akiyks@...il.com>
Subject: Re: [PATCH] documentation: Fix two-CPU control-dependency example
On 2017/07/20 14:47, Paul E. McKenney wrote:
> On Thu, Jul 20, 2017 at 09:31:41AM +0800, Boqun Feng wrote:
>> On Wed, Jul 19, 2017 at 02:56:02PM -0700, Paul E. McKenney wrote:
>>> On Thu, Jul 20, 2017 at 06:33:26AM +0900, Akira Yokosawa wrote:
>>>> On 2017/07/20 2:43, Paul E. McKenney wrote:
>>>>> On Mon, Jul 17, 2017 at 05:24:42PM +0900, Akira Yokosawa wrote:
>>>>>> >From b798b9b631e237d285aa8699da00bfb8ced33bea Mon Sep 17 00:00:00 2001
>>>>>> From: Akira Yokosawa <akiyks@...il.com>
>>>>>> Date: Mon, 17 Jul 2017 16:25:33 +0900
>>>>>> Subject: [PATCH] documentation: Fix two-CPU control-dependency example
>>>>>>
>>>>>> In commit 5646f7acc95f ("memory-barriers: Fix control-ordering
>>>>>> no-transitivity example"), the operator in "if" statement of
>>>>>> the two-CPU example was modified from ">=" to ">".
>>>>>> Now the example misses the point because there is no party
>>>>>> who will modify "x" nor "y". So each CPU performs only the
>>>>>> READ_ONCE().
>>>>>>
>>>>>> The point of this example is to use control dependency for ordering,
>>>>>> and the WRITE_ONCE() should always be executed.
>>>>>>
>>>>>> So it was correct prior to the above mentioned commit. Partial
>>>>>> revert of the commit (with context adjustments regarding other
>>>>>> changes thereafter) restores the point.
>>>>>>
>>>>>> Note that the three-CPU example demonstrating the lack of
>>>>>> transitivity stands regardless of this partial revert.
>>>>>>
>>>>>> Signed-off-by: Akira Yokosawa <akiyks@...il.com>
>>>>>
>>>>> Hello, Akira,
>>>>>
>>>>> You are quite right that many compilers would generate straightforward
>>>>> code for the code fragment below, and in that case, the assertion could
>>>>> never trigger due to either TSO or control dependencies, depending on
>>>>> the architecture this was running on.
>>>>>
>>>>> However, if the compiler was too smart for our good, it could figure
>>>>> out that "x" and "y" only take on the values zero and one, so that
>>>>> the "if" would always be taken. At that point, the compiler could
>>>>> simply ignore the "if" with the result shown below.
>>>>>
>>>>>> ---
>>>>>> Documentation/memory-barriers.txt | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>>>>> index c4ddfcd..c1ebe99 100644
>>>>>> --- a/Documentation/memory-barriers.txt
>>>>>> +++ b/Documentation/memory-barriers.txt
>>>>>> @@ -851,7 +851,7 @@ demonstrated by two related examples, with the initial values of
>>>>>> CPU 0 CPU 1
>>>>>> ======================= =======================
>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
>>>>>> - if (r1 > 0) if (r2 > 0)
>>>>>> + if (r1 >= 0) if (r2 >= 0)
>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
>>>>>>
>>>>>> assert(!(r1 == 1 && r2 == 1));
>>>>>
>>>>> Original program:
>>>>>
>>>>> CPU 0 CPU 1
>>>>> ======================= =======================
>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
>>>>> if (r1 >= 0) if (r2 >= 0)
>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
>>>>>
>>>>> assert(!(r1 == 1 && r2 == 1));
>>>>>
>>>>> Enthusiastically optimized program:
>>>>>
>>>>> CPU 0 CPU 1
>>>>> ======================= =======================
>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
>>>>>
>>>>> assert(!(r1 == 1 && r2 == 1));
>>>>>
>>>>> This optimized version could trigger the assertion.
>>>>>
>>>>> So we do need to stick with the ">" comparison.
>>>>
>>>> Well but,
>>>>
>>>> Current example:
>>>>
>>>> CPU 0 CPU 1
>>>> ======================= =======================
>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
>>>> if (r1 > 0) if (r2 > 0)
>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
>>>>
>>>> assert(!(r1 == 1 && r2 == 1));
>>>>
>>>> Such a clever compiler might be able to prove that "x" and "y"
>>>> are never modified and end up in the following:
>>>>
>>
>> Hi Akira,
>>
>> I wouldn't call that compiler a clever one, it's a broken one ;-)
>>
>> So here is the thing: READ_ONCE() is a *volatile* load, which means the
>> compiler has to generate code that actually does a load, so the values
>> of r1 and r2 depend on the loads, therefore, a sane compiler will not
>> optimize the if()s out because the volatile semantics of READ_ONCE().
>> Otherwise, I think we have a lot more to worry about than this case.
>>
>>>> CPU 0 CPU 1
>>>> ======================= =======================
>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
>>>>
>>>> assert(!(r1 == 1 && r2 == 1));
>>>>
>>>> This means it is impossible to describe this example in C,
>>>> doesn't it?
>>>
>>> That is a matter of some debate, but it has gotten increasingly more
>>> difficult to get C to do one's bidding over the decades.
>>>
>>>> What am I missing here?
>>>
>>> The compiler has to work harder in your example case, so it is probably
>>> just a matter of time. In the original with ">=", all the compiler needed
>>> to do was look at all the assignments and the initial value. In the
>>> original, it also had to do reasoning based on control dependencies
>>> (which are not yet part of the C standard).
>>>
>>> But yes, the degree to which a compiler can optimize atomics is a matter
>>> of some debate. Here is a proposal to let the developer choose:
>>>
>>
>> Hi Paul,
>>
>> I know the compiler could optimize atomics in very interesting ways, but
>> this case is about volatile, so I guess our case is still fine? ;-)
>
> Hello, Boqun,
>
> When I asked that question, the answer I got was "the compiler must
> emit the load instruction, but is under no obligation to actually use the
> value loaded".
So, it sounds like the following description found in memory-barriers.txt
just above the example of lack of transitivity:
> However, stores are not speculated. This means that ordering -is- provided
> for load-store control dependencies, as in the following example:
>
> q = READ_ONCE(a);
> if (q) {
> WRITE_ONCE(b, 1);
> }
>
> Control dependencies pair normally with other types of barriers.
> That said, please note that neither READ_ONCE() nor WRITE_ONCE()
> are optional! Without the READ_ONCE(), the compiler might combine the
> load from 'a' with other loads from 'a'. Without the WRITE_ONCE(),
> the compiler might combine the store to 'b' with other stores to 'b'.
> Either can result in highly counterintuitive effects on ordering.
>
> Worse yet, if the compiler is able to prove (say) that the value of
> variable 'a' is always non-zero, it would be well within its rights
> to optimize the original example by eliminating the "if" statement
> as follows:
>
> q = a;
> b = 1; /* BUG: Compiler and CPU can both reorder!!! */
>
> So don't leave out the READ_ONCE().
does not stand if the answer needs to be taken seriously.
The READ_ONCE() is not good enough to prevent the "if" statement
from being eliminated.
Hmm... If we do need to care about such an extreme optimization,
we should not rely on any control dependency in C source code
at least until the compiler gets educated about control dependency.
Is there some reasonable compromise?
Thanks, Akira
>
> I don't happen to like that answer, by the way. ;-)
>
> Thanx, Paul
>
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0062r1.html
>>>
>>
>> Great material to wake up mind in the morning! Thanks.
>>
>> Regards,
>> Boqun
>>
>>> What are your thoughts on this?
>>>
>>> Thanx, Paul
>>>
>>>> Thanks, Akira
>>>>
>>>>> That said, I very much welcome critical reviews of memory-barriers.txt,
>>>>> so please do feel free to continue doing that!
>>>>>
>>>>> Thanx, Paul
>>>>>
>>>>>
>>>>
>>>
>>
>
>
Powered by blists - more mailing lists