[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <40447973-6f6b-86f7-1147-d8f20a943767@huaweicloud.com>
Date: Tue, 24 Jan 2023 21:23:02 +0100
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Jonas Oberhauser <jonas.oberhauser@...wei.com>, paulmck@...nel.org,
parri.andrea@...il.com, will@...nel.org, peterz@...radead.org,
boqun.feng@...il.com, npiggin@...il.com, dhowells@...hat.com,
j.alglave@....ac.uk, luc.maranget@...ia.fr, akiyks@...il.com,
dlustig@...dia.com, joel@...lfernandes.org, urezki@...il.com,
quic_neeraju@...cinc.com, frederic@...nel.org,
linux-kernel@...r.kernel.org, viktor@...-sws.org
Subject: Re: [PATCH] tools/memory-model: Make ppo a subrelation of po
On 1/24/2023 6:14 PM, Alan Stern wrote:
> On Tue, Jan 24, 2023 at 02:14:03PM +0100, Jonas Oberhauser wrote:
>> After mulling it over a bit in my big old head, I consider that even though
>> dropping the [W] may be shorter, it might make for the simpler model by
>> excluding lots of cases.
>> That makes me think you should do it for real in the definition of prop. And
>> not just at the very end, because in fact each cumul-fence link might come
>> from a non-A-cumulative fence. So the same argument you are giving should be
>> applied recursively.
>> Either
>>
>> prop = (overwrite & ext)? ; (cumul-fence; [W])* ; rfe?
>>
>> or integrate it directly into cumul-fence.
> I dislike this sort of argument. I understand the formal memory model
> by relating it to the informal operational model. Thus, cumul-fence
> links a write W to another event E when the fence guarantees that W will
> propagate to E's CPU before E executes.
I later wondered why it's not defined like this and realized that prop
means that it's before E executes.
> That's how the memory model
> expresses the propagation properties of these fences.
I don't think that's really a perfect match though.
For example, W ->wmb E (and thus cumul-fence) does guarantee that W
propagates to E's CPU before E executes.
But the propagation property of wmb is that W propagates to every CPU
before E propagates to that CPU.
It just so happens that the time E propagates to E's CPU is the time it
executes.
Indeed, looking at the non-strong properties of fences only, should give
rise to a relation that only says "W propagates to any CPU before E
propagates to that CPU" and that is a relation between stores. And quite
different from "W propagates to E's CPU before E executes".
I believe that relation is (cumul-fence;[W])+.
Then
X ->(overwrite&ext);(cumul-fence;[W])* E
means that there is some W co-after X which propagates to any CPU no
later than E due to the weak properties of fences along that path.
And
X ->(overwrite&ext);(cumul-fence;[W])*;rfe? E
implies that there is some W co-after X which propagates to the CPU
executing E no later than E executes. (because E observes or executes
and hence propapagated to itself a store that must propagate to E's CPU
no earlier than W).
I think this is closer to the idea of expressing the (non-strong)
propagation properties of the fences.
> I don't want to
> rule out the possibility that E is a read merely because cumul-fence
> might be followed by another, A-cumulative fence.
Perhaps you mean non-A-cumulative fence?
The A-cumulative fences (when their A-cumulativity is actually used)
already rule out reads because they use
overwrite;cumul-fence*;rfe;(the a-cumulativity)
>
>>>>> Consider: Could we remove all propagation-ordering fences from ppo
>>>>> because they are subsumed by prop? (Or is that just wrong?)
>>>> Surely not, since prop doesn't usually provide ordering by itself.
>>> Sorry, I meant the prop-related non-ppo parts of hb and pb.
>> I still don't follow :( Can you write some equations to show me what you
>> mean?
> Consider:
>
> Rx=1 Ry=1
> Wrelease Y=1 Wx=1
>
> Here we have Wx=1 ->hb* Ry=1 by (prop \ id) & int, using the fact that
> Wy=1 is an A-cumulative release fence. But we also have
>
> Wx=1 ->rfe Rx=1 ->ppo Wy=1 ->rfe Ry=1.
>
> Thus there are two distinct ways of proving that Wx=1 ->hb* Ry=1. If we
> removed the fence term from the definition of ppo (or weakened it to
> just rmb | acq), we would eliminate the second, redundant proof. Is
> this the sort of thing you think we should do?
The reason I wouldn't do something like that is that firstly, the fence
does preserve the program order, and secondly there are proofs where you
need to use that fact.
>>>>>>> In fact, I wouldn't mind removing the happens-before, propagation, and
>>>>>>> rcu axioms from LKMM entirely, replacing them with the single
>>>>>>> executes-before axiom.
>>>>>> I was planning to propose the same thing, however, I would also propose to
>>>>>> redefine hb and rb by dropping the hb/pb parts at the end of these
>>>>>> relations.
>>>>>>
>>>>>> hb = ....
>>>>>> pb = prop ; strong-fence ; [Marked]
>>>>>> rb = prop ; rcu-fence ; [Marked]
>>>>>>
>>>>>> xb = hb|pb|rb
>>>>>> acyclic xb
>>>>> I'm not so sure that's a good idea. For instance, it would require the
>>>>> definitions of rcu-link and rb to be changed from having (hb* ; pb*) to
>>>>> having (hb | pb)*.
>>>> I think that's an improvement. It's obvious that (hb | pb)* is right and so
>>>> is (pb | hb)*.
>>>> For (hb* ; pb*), the first reaction is "why do all the hb edges need to be
>>>> before the pb edges?", until one realizes that pb actually allows hb* at the
>>>> end, so in a sense this is hb* ; (pb ; hb*)*, and then one has to
>>>> understand that this means that the prop;strong-fence edges can appear any
>>>> number of times at arbitrary locations. It just seems like defining (pb |
>>>> hb)* with extra steps.
>>> This can be mentioned explicitly as a comment or in explanation.txt.
>> Ok, but why not just use (pb|hb)* and (pb|hb|rb)* and not worry about
>> having to explain anything?
>> And make the hb and rb definitions simpler at the same time?
> Do you think (pb | hb)* is simpler than pb* (as in the statement of the
> propagation axiom)?
pb+, however aren't you thinking of getting rid of the propagation axiom?
I still think (pb' | hb)+ where pb' is the simpler definition of pb is
simpler than pb*, where pb=pb';hb*.
> Besides, remember what I said about understanding the formal memory
> model in terms of the operational model. pb relates a write W to
> another event E when the strong fence guarantees that W will propagate
> to E's CPU before E executes.
I suppose to every CPU before E executes?
> If the hb* term were omitted from the definition of pb, this wouldn't be true any more. Or at least, not as
> true as it should be.
Why is that the right level of how true it should be?
doesn't W ->pb;rb E also guarantee that W will propagate to E's CPU
before E executes?
Or even just W ->pb;pb E?
Why only consider W->pb;hb E?
I'd rather think of it in terms of "this is the basic block that implies
that it W executes before E because it propagates to every CPU before E
executes, and then you can of course extend it by adding any of pb,rb,
and hb at the end to get a longer "executes before"".
Best wishes, jonas
Powered by blists - more mailing lists