linux-kernel - Re: [RFC] tools/memory-model: Rule out OOTA

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1e79ed9f-827d-4c37-b2ec-026374cf819d@huaweicloud.com>
Date: Fri, 17 Jan 2025 09:34:23 +0100
From: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>
To: Alan Stern <stern@...land.harvard.edu>,
 Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
Cc: paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
 peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
 dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
 akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
 urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
 linux-kernel@...r.kernel.org, lkmm@...ts.linux.dev
Subject: Re: [RFC] tools/memory-model: Rule out OOTA

On 1/17/2025 12:02 AM, Alan Stern wrote:
> On Thu, Jan 16, 2025 at 08:08:22PM +0100, Jonas Oberhauser wrote:
>> I'm well aware that an absolute definition of semantic dependencies is not
>> easy to give.
> 
> In fact it's undecidable.  No tool is ever going to be able to detect
> semantic dependencies perfectly.
> 
>> Btw, I'm re-reading that paper and here are some comments:
>> - I agree with the conclusion that C++ should not try to solve OOTA other
>> than declaring that they do not exist
>> - the definition of OOTA as sdep | rfe is not the right one. You really need
>> sdep ; rfe, because you can have multiple subsequent sdep links that
>> together are not a dependency, e.g.,
>>
>> int * x[] = { &a, &a };
>> int i = b;
>> *x[i] = 1;
>>
>> here the semantic address dependency from loading b to loading x[i] and the
>> semantic address dependency from loading x[i] and storing to *x[i] do not
>> together form a semantic dependency anymore, because *x[i] is always a. So
>> this whole code can just become a=1, and with mirrored code you can get
>> a=b=1, which is an sdep | rfe cycle.
> 
> We regard sdep as extending from loads (or sets of loads) to stores.
> (Perhaps the paper doesn't state this explicitly -- it should.)  So an
> sdep ; sdep sequence is not possible.
> 
> Nevertheless, it's arguable that in your example there is no semantic
> dependency from the load of b to the load of x[i].  Given the code
> shown, a compiler could replace the load of x[i] with the constant value
> &a, yielding simply *(&a) = 1.
> 
> There are other examples which do make the point, however.  For example:
> 
> int a = 1, b = 1, c = 2;
> int *x[] = {&a, &b, &c};
> 
> int r1 = i;
> if (r1 > 1)
> 	r1 = 1;
> int *r2 = x[r1];
> int r3 = *r2;
> q = r3;
> 
> Here there is a semantic dependency (if you accept dependencies to
> loads) from the load of i to the load of x[r1] and from the load of *r2
> to the store to q.  But overall the value stored to q is always 1, so it
> doesn't depend on the load from i.
> 
>> - I did not like the phrasing that a dependency is a function of one
>> execution, especially in contrast to source code. For a fixed execution,
>> there are no dependencies because all values are fixed. It is the other
>> executions - where you could have read another value - that create the
>> dependencies.
>>
>> Perhaps it is better to say that it is not a *local* function of source
>> code, i.e., just because the same source code has a dependency in context C
>> does not mean that it has a dependency in context C'.
> 
> Of course, what we meant is that whether or not there is a semantic
> dependency depends on both the source code and the execution.
> 
>> In fact I would say a major gotcha of dependencies is that they are not a
>> function of even the set of all permissible (according to the standard)
>> executions.
>> That is because the compiler does not have to make every permissible
>> execution possible, only at least one.
> 
> The paper includes examples of this, I believe.
> 
>> If the compiler knows that among all executions that it actually makes
>> possible, some value is always 0 - even if there is a permissible execution
>> in which it is not 0 - it can still replace that value with 0. E.g.
>>
>>    T1 { x = 1; x = 0; }
>>    T2 { T[x] = 1; }
>>
>>     ~~ merge -- makes executions where T[x] reads x==1 impossible ~>
>>
>>    T1' { x = 1; x = 0; T[x] = 1; }
>>
>>     ~~ replace ~>
>>
>>    T1' { x = 1; x = 0; T[0] = 1; // no more dependency :( }
>>
>> which defeats any semantic dependency definition that is a function of a
>> single execution.
> 
> Our wording could be improved.  We simply want to make the point that
> the same source code may have a semantic dependency in one execution but
> not in another.  I guess we should avoid calling it a "function" of the
> execution, to avoid implying that the execution alone is what matters.
> 
>> Of course you avoid this by making it really a function of the compiler +
>> program *and* the execution, and looking at all the other possible (not just
>> permissible) executions.
>>
>> - On the one hand, that definition makes a lot of sense. On the other hand,
>> at least without the atomics=volatile restriction it would have the downside
>> that a compiler which generates just a single execution for your program can
>> say that there are no dependencies whatsoever and generate all kinds of "out
>> of thin air" behaviors.
> 
> That is so.  But a compiler which examines only a single thread at a
> time cannot afford to generate just a single execution, because it
> cannot know what values the loads will obtain when the full program
> runs.
> 
>> I am not sure if that gets really resolved by the volatile restrictions you
>> put, but either way those seem far stronger than what one would want.
> 
> It does get resolved, because treating atomics as volatile also means
> that the compiler cannot afford to generate just a single execution.
> Again, because it cannot know what values the loads will obtain at
> runtime, since volatile loads can yield any value in a non-benign
> environment.
> 
>> I would say that the approach with volatile is overzealous because it tries
>> to create a "local" order solution to the problem that only requires a
>> "global" ordering solution. Since not every semantic dependency needs to
>> provide order in C++ -- only the cycle of dependencies -- it is totally ok
>> to add too many semantic dependency edges to a program, even those that are
>> not going to be exactly maintained by every compiler, as long as we can
>> ensure that globally, no dependency cycle occurs.
> 
> But then how would you characterize semantic dependencies, if you want
> to allow the definition to include some dependencies that aren't
> semantic but not so many that you ever create a cycle?  This sounds like
> an even worse problem than we started with!
> 
>> So for example, if we merge  x = y || y = x,  the merge can turn it into
>> x=y=x or y=x=y (and then into an empty program), but not into a cyclic
>> dependency. So even though one side of the dependency may be violated, for
>> sake of OOTA, we could still label both sides as dependent.
> 
> They _are_ both semantically dependent (in the original parallel
> version, I mean).  I don't see what merging has to do with it.
> 
>> For LKMM the problem is of course much easier because you have volatiles and
>> compiler barriers. Again you could maybe add incorrect semantic dependencies
>> between accesses, as long as only the really preserved ones will imply
>> ordering.
> 
> I'm not particularly concerned about OOTA or semantic dependencies in
> LKMM.
> 
>> So I'm not convinced that for either of the two cases you need to do a
>> compiler-specific definition of dependencies.
> 
> For the C++ case, I cannot think of any other way to approach the
> problem.  Nor do I see anything wrong with a compiler-specific
> definition, given how nebulous the whole idea is in the first place.
> 
>> BTW, for what it's worth, Dat3M in a sense uses the clang dependencies - it
>> first allows the compiler to do its optimizations, and then verifies the
>> llvm-ir (with a more hardware-like dependency definition).
> 
> What do you mean by "verifies"?  What property of the llvm-ir does it
> verify?

Let me be more precise here. The Dat3M verifier has its own IR and it 
can verify 3 kinds of properties:

- assertions written in the code
- lack of liveness violations (e.g., a spinloop does not hang)
- other properties written in the .cat file as "flag ~empty(r)", data 
races in LKMM/C fit in this category.

Dat3M has several parsers to create its own IR: it can read the litmus 
format of herd, but also C code by first converting to llvm-ir, and then 
parsing this one.

In the case of verifying kernel code wrt LKMM we do some magic such that 
anything related to the "LKMM concurrency API" (things like 
atomic_cmpxchg) appear as function calls in the llvm-ir. This allows us 
to parse this as the concrete type of event we want.

Hernan
> 
>> I think something like that can be a good practical solution with fewer
>> problems than other attempts to approximate the solution.
> 
> I do not like the idea of tying the definition of OOTA (which needs to
> apply to every implementation) to a particular clang compiler.
> 
> Alan