Message-ID: <d941c33e-27db-8764-3a9e-515dfb481cca@huaweicloud.com>
Date: Fri, 20 Jan 2023 12:12:50 +0100
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: Alan Stern <stern@...land.harvard.edu>,
Jonas Oberhauser <jonas.oberhauser@...wei.com>
Cc: paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
linux-kernel@...r.kernel.org, viktor@...-sws.org
Subject: Re: [PATCH] tools/memory-model: Make ppo a subrelation of po
On 1/19/2023 9:06 PM, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 04:05:38PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/19/2023 4:13 AM, Alan Stern wrote:
>>> On Wed, Jan 18, 2023 at 10:38:11PM +0100, Jonas Oberhauser wrote:
>>>> On 1/18/2023 8:52 PM, Alan Stern wrote:
>>>>> On Tue, Jan 17, 2023 at 08:31:59PM +0100, Jonas Oberhauser wrote:
>>>>>> - ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
>>>>>> - ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
>>>>>> - fencerel(After-unlock-lock) ; [M])
>>>>>> + ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M])
>>>>> Shouldn't the po case of (co | po) remain intact here?
>>>> You can leave it here, but it is already covered by two other parts: the
>>>> ordering given through ppo/hb is covered by the po-unlock-lock-po & int in
>>>> ppo, and the ordering given through pb is covered by its inclusion in
>>>> strong-order.
>>> What about the ordering given through
>>> A-cumul(strong-fence)/cumul-fence/prop/hb? I suppose that might be
>>> superseded by pb as well,
>> Indeed, in fact all of A-cumul(strong-fence) is already covered through pb.
>>> but it seems odd not to have it in hb.
>>> In general, the idea in the memory model is that hb ordering relies on
>>> the non-strong properties of fences, whereas pb relies on the properties
>>> of strong fences, and rb relies on the properties of the RCU fences.
>> I agree in the sense that all strong-ordering operations are A-cumulative
>> and not including them in A-cumul is weird.
>
>> On the other side the model becomes a tiny bit smaller and simpler when all
>> ordering of prop;strong-ordering goes through a single place (pb).
>>
>> If you want, you could think of the A-cumulativity of strong ordering
>> operations as being a consequence of their strong properties. Mathematically
> That's backward logic. Being strong isn't the reason the fences are
> A-cumulative. Indeed, the model could have been written not to assume
> that strong fences are A-cumulative.
It's not just the model, it's also how strong fences are introduced in
the documentation.
I agree though that you could decouple the notion of strong from
A-cumulativity.
But would anyone want a strong fence that is not A-cumulative?
It's a bit like asking in C11 for a barrier that has the sequential
consistency guarantee of appearing in the global order S, but doesn't
have release or acquire semantics. Yes, you could define that, but would
it really make sense?
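For reference, the current coupling looks roughly like this (quoting
linux-kernel.cat from memory, so take the exact shape with a grain of
salt):
        let A-cumul(r) = (rfe ; [Marked])? ; r
        (* non-strong, A-cumulative side: feeds hb through prop *)
        let cumul-fence = [Marked] ; (A-cumul(strong-fence | po-rel) |
                wmb | po-unlock-lock-po) ; [Marked]
        (* strong side *)
        let pb = prop ; strong-fence ; hb* ; [Marked]
A strong fence picks up its A-cumulativity through the A-cumul()
wrapper in cumul-fence and its strength through pb, so the two notions
could be decoupled there; I just don't see what that would buy us.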
>>>>>> @@ -91,8 +89,12 @@ acyclic hb as happens-before
>>>>>> (* Write and fence propagation ordering *)
>>>>>> (****************************************)
>>>>>> -(* Propagation: Each non-rf link needs a strong fence. *)
>>>>>> -let pb = prop ; strong-fence ; hb* ; [Marked]
>>>>>> +(* Strong ordering operations *)
>>>>>> +let strong-order = strong-fence | ([M] ; po-unlock-lock-po ;
>>>>>> + [After-unlock-lock] ; po ; [M])
>>>>> This is not the same as the definition removed above. In particular,
>>>>> po-unlock-lock-po only allows one step along the locking chain -- it has
>>>>> rf where the definition above has co.
>>>> Indeed it is not, but the subsequent lock-unlock sequences are in hb*. For
>>>> this reason it can be simplified to just consider the directly following
>>>> unlock().
>>> I'm not sure that argument is right. The smp_mb__after_unlock_lock
>>> needs to go after the _last_ lock in the sequence, not after the first.
>>> So you don't have to worry about subsequent lock-unlock sequences; you
>>> have to worry about preceding lock-unlock sequences.
>> I formalized a proof of equivalence in Coq a few months ago, but I was
>> recalling the argument incorrectly from memory.
>> I think the correct argument is that the previous po-unlock-lock-po
>> links form a cumul-fence* ; rfe ; po sequence that starts with a
>> po-rel. So any
>> prop ; ... ; co ; ... ; this fence ; ...
>> can be rewritten to
>> prop ; cumul-fence* ; po-unlock-lock-po ; this fence ; ...
>> and because the first cumul-fence here needs to start with a
>> po-release, the prop ; cumul-fence* can be merged into a larger prop,
>> leaving
>> prop ; po-unlock-lock-po ; this fence ; ...
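(Side remark, to make that concrete: the executions covered by the co
case look roughly like
        M1 ->po unlock(s) ->rf lock(s) ... unlock(s) ->rf lock(s)
           ->po smp_mb__after_unlock_lock() ->po M2
with a chain of critical sections for the same lock s between the first
unlock and the final lock; M1, M2, and s are just placeholder names.
Each unlock ->rf lock ->po step in that chain is a po-unlock-lock-po
and hence a cumul-fence, which is what the rewriting above exploits.)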
> This may be so, but I would like to point out that the memory model was
> not particularly designed to be as short or as simple as possible, but
> more to reflect transparently the intuitions that kernel programmers
> have about the ways ordering should work in the kernel. It may very
> well include redundancies as a result. I don't think that's a bad
> point.
I agree that sometimes redundancies have value. But I think having,
where possible, a kind of minimal responsibility principle where each
fence appears in as few relations as possible also has value.
I've seen this when trying to help people on my team learn to use LKMM:
they see a specific type of fence and get stuck for a while chasing one
of its uses. For example, they may chase a long prop link that uses the
only strong fence in the example somewhere in the middle, which later
turns out to be a dead end because what they actually need is pb.
For someone who doesn't know that we never need to rely on that use to
get ordering, this basically doubles the time spent looking at the
graph and chasing definitions. And for very good reasons there are
already a lot of definitions, even with redundancies eliminated.
Perhaps I would say that including these redundancies is good for
understanding why the formalization makes sense, but excluding them is
better for actually using the formalization.
This applies both when looking at code with a printout of the model
next to you and when reasoning about LKMM itself (e.g., when changes
are made to LKMM and one wants to check that they interact well with
the existing parts of LKMM).
I think that in the long term, increasing usability is more important
than keeping the correctness obvious. But maybe I'm biased because I'm
mostly on the user side of LKMM :D