[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1e79ed9f-827d-4c37-b2ec-026374cf819d@huaweicloud.com>
Date: Fri, 17 Jan 2025 09:34:23 +0100
From: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>
To: Alan Stern <stern@...land.harvard.edu>,
Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
Cc: paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
linux-kernel@...r.kernel.org, lkmm@...ts.linux.dev
Subject: Re: [RFC] tools/memory-model: Rule out OOTA
On 1/17/2025 12:02 AM, Alan Stern wrote:
> On Thu, Jan 16, 2025 at 08:08:22PM +0100, Jonas Oberhauser wrote:
>> I'm well aware that an absolute definition of semantic dependencies is not
>> easy to give.
>
> In fact it's undecidable. No tool is ever going to be able to detect
> semantic dependencies perfectly.
>
>> Btw, I'm re-reading that paper and here are some comments:
>> - I agree with the conclusion that C++ should not try to solve OOTA other
>> than declaring that they do not exist
>> - the definition of OOTA as sdep | rfe is not the right one. You really need
>> sdep ; rfe, because you can have multiple subsequent sdep links that
>> together are not a dependency, e.g.,
>>
>> int * x[] = { &a, &a };
>> int i = b;
>> *x[i] = 1;
>>
>> here the semantic address dependency from loading b to loading x[i] and the
>> semantic address dependency from loading x[i] and storing to *x[i] do not
>> together form a semantic dependency anymore, because *x[i] is always a. So
>> this whole code can just become a=1, and with mirrored code you can get
>> a=b=1, which is an sdep | rfe cycle.
>
> We regard sdep as extending from loads (or sets of loads) to stores.
> (Perhaps the paper doesn't state this explicitly -- it should.) So an
> sdep ; sdep sequence is not possible.
>
> Nevertheless, it's arguable that in your example there is no semantic
> dependency from the load of b to the load of x[i]. Given the code
> shown, a compiler could replace the load of x[i] with the constant value
> &a, yielding simply *(&a) = 1.
>
> There are other examples which do make the point, however. For example:
>
> int a = 1, b = 1, c = 2;
> int *x[] = {&a, &b, &c};
>
> int r1 = i;
> if (r1 > 1)
> r1 = 1;
> int *r2 = x[r1];
> int r3 = *r2;
> q = r3;
>
> Here there is a semantic dependency (if you accept dependencies to
> loads) from the load of i to the load of x[r1] and from the load of *r2
> to the store to q. But overall the value stored to q is always 1, so it
> doesn't depend on the load from i.
>
>> - I did not like the phrasing that a dependency is a function of one
>> execution, especially in contrast to source code. For a fixed execution,
>> there are no dependencies because all values are fixed. It is the other
>> executions - where you could have read another value - that create the
>> dependencies.
>>
>> Perhaps it is better to say that it is not a *local* function of source
>> code, i.e., just because the same source code has a dependency in context C
>> does not mean that it has a dependency in context C'.
>
> Of course, what we meant is that whether or not there is a semantic
> dependency depends on both the source code and the execution.
>
>> In fact I would say a major gotcha of dependencies is that they are not a
>> function of even the set of all permissible (according to the standard)
>> executions.
>> That is because the compiler does not have to make every permissible
>> execution possible, only at least one.
>
> The paper includes examples of this, I believe.
>
>> If the compiler knows that among all executions that it actually makes
>> possible, some value is always 0 - even if there is a permissible execution
>> in which it is not 0 - it can still replace that value with 0. E.g.
>>
>> T1 { x = 1; x = 0; }
>> T2 { T[x] = 1; }
>>
>> ~~ merge -- makes executions where T[x] reads x==1 impossible ~>
>>
>> T1' { x = 1; x = 0; T[x] = 1; }
>>
>> ~~ replace ~>
>>
>> T1' { x = 1; x = 0; T[0] = 1; // no more dependency :( }
>>
>> which defeats any semantic dependency definition that is a function of a
>> single execution.
>
> Our wording could be improved. We simply want to make the point that
> the same source code may have a semantic dependency in one execution but
> not in another. I guess we should avoid calling it a "function" of the
> execution, to avoid implying that the execution alone is what matters.
>
>> Of course you avoid this by making it really a function of the compiler +
>> program *and* the execution, and looking at all the other possible (not just
>> permissible) executions.
>>
>> - On the one hand, that definition makes a lot of sense. On the other hand,
>> at least without the atomics=volatile restriction it would have the downside
>> that a compiler which generates just a single execution for your program can
>> say that there are no dependencies whatsoever and generate all kinds of "out
>> of thin air" behaviors.
>
> That is so. But a compiler which examines only a single thread at a
> time cannot afford to generate just a single execution, because it
> cannot know what values the loads will obtain when the full program
> runs.
>
>> I am not sure if that gets really resolved by the volatile restrictions you
>> put, but either way those seem far stronger than what one would want.
>
> It does get resolved, because treating atomics as volatile also means
> that the compiler cannot afford to generate just a single execution.
> Again, because it cannot know what values the loads will obtain at
> runtime, since volatile loads can yield any value in a non-benign
> environment.
>
>> I would say that the approach with volatile is overzealous because it tries
>> to create a "local" order solution to the problem that only requires a
>> "global" ordering solution. Since not every semantic dependency needs to
>> provide order in C++ -- only the cycle of dependencies -- it is totally ok
>> to add too many semantic dependency edges to a program, even those that are
>> not going to be exactly maintained by every compiler, as long as we can
>> ensure that globally, no dependency cycle occurs.
>
> But then how would you characterize semantic dependencies, if you want
> to allow the definition to include some dependencies that aren't
> semantic but not so many that you ever create a cycle? This sounds like
> an even worse problem than we started with!
>
>> So for example, if we merge x = y || y = x, the merge can turn it into
>> x=y=x or y=x=y (and then into an empty program), but not into a cyclic
>> dependency. So even though one side of the dependency may be violated, for
>> sake of OOTA, we could still label both sides as dependent.
>
> They _are_ both semantically dependent (in the original parallel
> version, I mean). I don't see what merging has to do with it.
>
>> For LKMM the problem is of course much easier because you have volatiles and
>> compiler barriers. Again you could maybe add incorrect semantic dependencies
>> between accesses, as long as only the really preserved ones will imply
>> ordering.
>
> I'm not particularly concerned about OOTA or semantic dependencies in
> LKMM.
>
>> So I'm not convinced that for either of the two cases you need to do a
>> compiler-specific definition of dependencies.
>
> For the C++ case, I cannot think of any other way to approach the
> problem. Nor do I see anything wrong with a compiler-specific
> definition, given how nebulous the whole idea is in the first place.
>
>> BTW, for what it's worth, Dat3M in a sense uses the clang dependencies - it
>> first allows the compiler to do its optimizations, and then verifies the
>> llvm-ir (with a more hardware-like dependency definition).
>
> What do you mean by "verifies"? What property of the llvm-ir does it
> verify?
Let me be more precise here. The Dat3M verifier has its own IR and it
can verify 3 kinds of properties:
- assertions written in the code
- lack of liveness violations (e.g., a spinloop does not hang)
- other properties written in the .cat file as "flag ~empty(r)", data
races in LKMM/C fit in this category.
Dat3M has several parsers to create its own IR: it can read the litmus
format of herd, but also C code by first converting to llvm-ir, and then
parsing this one.
In the case of verifying kernel code wrt LKMM we do some magic such that
anything related to the "LKMM concurrency API" (things like
atomic_cmpxchg) appear as function calls in the llvm-ir. This allows us
to parse this as the concrete type of event we want.
Hernan
>
>> I think something like that can be a good practical solution with fewer
>> problems than other attempts to approximate the solution.
>
> I do not like the idea of tying the definition of OOTA (which needs to
> apply to every implementation) to a particular clang compiler.
>
> Alan
Powered by blists - more mailing lists