[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6902de65-a487-1fc4-dd9a-598c6d6275f4@huaweicloud.com>
Date: Fri, 24 May 2024 10:16:16 +0200
From: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>
To: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>,
Andrea Parri <parri.andrea@...il.com>
Cc: Alan Stern <stern@...land.harvard.edu>,
"Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, kernel-team@...a.com, boqun.feng@...il.com,
j.alglave@....ac.uk, luc.maranget@...ia.fr,
Joel Fernandes <joel@...lfernandes.org>
Subject: Re: LKMM: Making RMW barriers explicit
On 5/23/2024 10:30 PM, Jonas Oberhauser wrote:
>
>
> Am 5/23/2024 um 6:35 PM schrieb Andrea Parri:
>>> I would phrase it more extreme: I want to get rid of the unnecessary
>>> non-standard parts of the herd representation.
>>
>> Please indulge the thought that what might appear to be "non-standard"
>> to one or one's community might appear differently to others.
>
> Certainly. And of course, the formalization of the LKMM doesn't have to
> be optimized for the WMM community, even though I suspect that they (and
> possibly the tools they develop) are the main direct consumers of the
> formalization.
>
>> Continuing with the example above, I'm pretty sure your "standard" and
>> simple idea of mb-reads and mb-writes, or even "unsuccessful" mb-reads
>> will turn up the nose of many kernel developers...
>
> I'm not sure how true that is. Firstly I'm not sure how many kernel
> developers really read the formalization and try to see what it does,
> rather than relying on tools to gain an intuition of what's going on.
>
> But let's say a kernel developer wants to make sure that their intuition
> of how cmpxchg works is matched by the formal model, e.g., they want to
> double check that the formal model is correct
> after they got some unexpected results in a herd7 litmus test.
>
> How would they go about it today? The only way is to read the OCaml
> source code, because there's no other code that obviously defines the
> behavior. At best, they would read
>
> atomic_cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
> atomic_cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
> atomic_cmpxchg(X,V,W) __cmpxchg{mb}(X,V,W)
>
> see the Acquire, Release, and Mb references in the model, and think "ok,
> so '__cmpxchg{acquire}' means I get an Acquire tag in the success case
> which gives acq_po ordering, '__cmpxchg{release}' means I get a Release
> tag in the success case which gives po_rel ordering. Therefore
> '__cmpxchg{mb}' must give me an Mb tag in the success case. That would
> give me mb ordering, but not between the store and subsequent
> operations, and that's just wrong."
>
> But of course this "understanding by analogy" is broken, because there's
> a bunch of special OCaml match expressions in herd that look only for
> release or mb and just give totally different behavior for that one
> case. Every other tag behaves exactly the same way.
>
> At worst they wouldn't even guess that this only is the success ordering
> (that's definitely what happened to us in the weak memory model
> community, but we don't matter for this argument).
>
> A better situation would be to do something like this:
>
> (*success*)(*fail*)
> atomic_cmpxchg_acquire(X,V,W) __cmpxchg {acquire} {once} (X,V,W)
> atomic_cmpxchg_release(X,V,W) __cmpxchg {release} {once} (X,V,W)
> atomic_cmpxchg(X,V,W) __cmpxchg {mb} {once} (X,V,W)
I also think this is the best option. We could change the internal
representation of __cmpxchg in herd7 to have two memory orders (*)
similar to the current internal representation of C11 CAS and get the
memory order to be used for the success/fail case from the .def instead
of having this hardcoded in the code.
(*) Not to be confused to how LKMM sees a CAS instruction.
Hernan
>
> and being explicit in the model about what the Mb tag does:
>
> | [M];fencerel(Mb);[M] (* smp_mb() and operations with Mb ordering
> such as cmpxchg() *)
> | [M];po;[R & Mb] (* reads with an Mb tag - coming from full fence
> rmws - are ordered as-if there was an smp_mb() right before the read *)
> | [W & Mb];po;[M] (* correspondingly, writes with an Mb tag are
> ordered as-if there was an smp_mb() right after the write *)
>
> And suddenly you can read the model and map it 1:1 to the intuition that
> you can treat a successful cmpxchg() (or an xchg() etc.) as-if it was
> enclosed by smp_mb().
>
> There doesn't need to be a real fence.
>
>
>> The current repre-
>> sentation for xchg() was described in the ASPLOS'18 work
>
> Do you mean the one example in Table 3?
> What about cmpxchg() or cmpxchg_acquire()?
>
>> But that's not the point, "standards" can change and certainly kernels
>> and tools do. My remark was more to point out that you're not getting
>> rid of anything...,
>
> We're definitely getting rid of some lines in herd7, that have been
> added solely for dealing with this specific case of LKMM.
>
> For example, there's some code looking like this (for a memory ordering
> tag `a`)
>
> (match a with ["release"] -> a | _ -> a_once)
>
> which specifically refers to the release tag in LKMM and we can turn
> that into
>
> a
>
> with no more reference to any LKMM tags. Of course, this is not the
> Linux community's problem, just the herd (and others who want to use the
> literal cat file of LKMM) maintainers who have to deal with it.
>
> And imagine that at some point you want to add another tag to the linux
> kernel - like for example for 'accessing an atomic in initialization
> code, so that the compiler can do optimizations like merge a bunch of
> bitwise operations'. Let's call it 'init.
>
> What will be the behavior of
>
> WRITE_INIT(X,V) __store{init}(X,V);
>
> with the current herd7? Honestly I have no clue, because there might be a
>
> (match a with ["release"] -> a | _ -> a_once)
>
> somewhere that will turn this 'init into 'once. We'd have to comb the
> herd7 codebase to know for sure (or hope that our experiments are
> sufficiently thorough to catch all cases).
>
> >you're simply proposing a different representation
>
> I wouldn't phrase it like that, since it's a representation many people
> familiar with WMM would expect, but admittedly that doesn't have to be
> the deciding factor.
>
>> asking kernel developers & maintainers to "deal with it"
>
> Deal with what, no longer having to learn OCaml to be sure that the
> LKMM's formal definition matches the description in memory_barriers.txt?
> Otherwise, I don't see anything that changes for the worse.
> With the change, people need to think for a few seconds to see that the
> explicit rules in the .cat file are a formalization of "treat xchg() as
> if it had smp_mb() before and after".
> Without the change, they need to read an OCaml codebase to see that is
> meant by
>
> atomic_xchg(X,V,W) __xchg{mb}(X,V,W)
>
> So I don't see any downsides.
>
>
> I would also support making the representation explicit in the .def
> files instead, with something like
>
> cmp_xchg(x,e,v) = { if (*x==e) { smp_mb(); int r =
> __cmp_xchg{once}(x,e,v) ; smp_mb(); return r } else { __read{once}(x) } }
>
> Then you get to keep the representation.
> Without knowing herd's internals, I have no idea of how to seriously
> specify a meta language so that herd could effectively and efficiently
> deal with it in general. But one could at least envision some specific
> syntax for cmpxchg with a code sequence for the failure case and a code
> sequence for the success case.
>
> Personally I don't prefer this option, firstly because I don't see a
> good reason for the Linux community to go their own way here. I don't
> think there's really much of a problem with saying "this is the
> intuition; this is how we formalize it" and for the two not to be
> completely identical. It happens all the time, and in this case the gap
> between "we formalize it by really having two smp_mb() appear in the
> graph if it's successful" and "we formalize it by providing the same
> ordering that the smp_mb() would have if it was there" isn't big.
> Especially for those kernel people who have enough motivation to
> actually deep dive into the formalization in the first place. But it
> makes it a lot more accessible for the WMM community, which can only
> benefit LKMM.
>
> And secondly because people will probably mostly focus on the .cat file,
> which means that it's still a bit of a booby trap to put something so
> important (and perhaps surprising) into the .def file, but at least one
> that is in the open for people who are more careful and also read the
> .def file.
>
> > I am looking forward to something more than "because we can".
>
> - it makes it easier to maintain the LKMM in the future, because you
> don't have to work around hidden transformations inside herd7
> - it makes implicit behavior explicit
> - it makes it easier to understand that the formalization matches the
> intention
> - it makes it easier to learn the LKMM from the formalization without
> having to cross-reference every bit with the informal documentation to
> avoid misunderstandings
>
>
> Have a good time,
> jonas
>
>
>
Powered by blists - more mailing lists