Message-ID: <Y9GVFkVRRRs5/rBd@rowland.harvard.edu>
Date: Wed, 25 Jan 2023 15:46:14 -0500
From: Alan Stern <stern@...land.harvard.edu>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>,
Andrea Parri <parri.andrea@...il.com>,
Jonas Oberhauser <jonas.oberhauser@...wei.com>,
Peter Zijlstra <peterz@...radead.org>, will <will@...nel.org>,
"boqun.feng" <boqun.feng@...il.com>, npiggin <npiggin@...il.com>,
dhowells <dhowells@...hat.com>,
"j.alglave" <j.alglave@....ac.uk>,
"luc.maranget" <luc.maranget@...ia.fr>, akiyks <akiyks@...il.com>,
dlustig <dlustig@...dia.com>, joel <joel@...lfernandes.org>,
urezki <urezki@...il.com>,
quic_neeraju <quic_neeraju@...cinc.com>,
frederic <frederic@...nel.org>,
Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 11:46:51AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 02:08:59PM -0500, Alan Stern wrote:
> > Why do you want the implementation to forbid it? The pattern of the
> > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > that pattern. Do you?
>
> Jonas asked a similar question, so I am answering you both here.
>
> With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> N, it is reasonably well known that you don't get ordering, hardware
> support notwithstanding. After all, none of the Linux kernel, C, and C++
> memory models make that guarantee. In addition, the non-RCU barriers
> and accesses that you can use to create N+2W have been in very wide use
> for a very long time.
>
> Although RCU has been in use for almost as long as those non-RCU barriers,
> it has not been in wide use for anywhere near that long. So I cannot
> be so confident in ruling out some N+2W use case for RCU.
>
> Such a use case could play out as follows:
>
> 1. They try LKMM on it, see that LKMM allows it, and therefore find
> something else that works just as well. This is fine.
>
> 2. They try LKMM on it, see that LKMM allows it, but cannot find
> something else that works just as well. They complain to us,
> and we either show them how to get the same results some other
> way or adjust LKMM (and perhaps the implementations) accordingly.
> These are also fine.
>
> 3. They don't try LKMM on it, see that it works when they test it,
> and they send it upstream. The use case is entangled deeply
> enough in other code that no one spots it on review. The Linux
> kernel unconditionally prohibits the cycle. This too is fine.
>
> 4. They don't try LKMM on it, see that it works when they test it,
> and they send it upstream. The use case is entangled deeply
> enough in other code that no one spots it on review. Because RCU
> grace periods incur tens of microseconds of latency at a minimum,
> all tests (almost) always pass, just due to delays and unrelated
> accesses and memory barriers. Even in kernels built with some
> future SRCU equivalent of CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> But the Linux kernel allows the cycle when there is a new moon
> on Tuesday during a triple solar eclipse of Jupiter, a condition
> that is eventually met, and at the worst possible time and place.
>
> This is absolutely the opposite of fine.
>
> I don't want to deal with #4. So this is an RCU-maintainer use case
> that I would like to avoid. ;-)

Since it is well known that the non-RCU barriers in the Linux kernel, C,
and C++ do not enforce ordering in n+nW, and seeing as how your litmus
test relies on an smp_store_release() at one point, I think it's
reasonable to assume people won't expect it to provide ordering.

Ah, but what about a litmus test that relies solely on RCU?

rcu_read_lock      Wy=2               rcu_read_lock      Wv=2
Wx=2               synchronize_rcu    Wu=2               synchronize_rcu
Wy=1               Wu=1               Wv=1               Wx=1
rcu_read_unlock                       rcu_read_unlock

exists (x=2 /\ y=2 /\ u=2 /\ v=2)

Luckily, this _is_ forbidden by the LKMM. So I think you're okay.
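
For anyone who wants to check this with herd7, the test above could be
written out roughly as follows. This is only a sketch: the test name is
made up, the thread numbering P0..P3 simply follows the four columns from
left to right, and it assumes each "W" stands for a WRITE_ONCE().

```
C RCU-only-4W-sketch

(*
 * Hypothetical spelling-out of the RCU-only test above.
 * Assumptions: the name is invented, and every "Wvar=n" in the
 * shorthand is taken to be WRITE_ONCE(*var, n).
 *)

{}

P0(int *x, int *y)
{
	rcu_read_lock();
	WRITE_ONCE(*x, 2);
	WRITE_ONCE(*y, 1);
	rcu_read_unlock();
}

P1(int *y, int *u)
{
	WRITE_ONCE(*y, 2);
	synchronize_rcu();
	WRITE_ONCE(*u, 1);
}

P2(int *u, int *v)
{
	rcu_read_lock();
	WRITE_ONCE(*u, 2);
	WRITE_ONCE(*v, 1);
	rcu_read_unlock();
}

P3(int *v, int *x)
{
	WRITE_ONCE(*v, 2);
	synchronize_rcu();
	WRITE_ONCE(*x, 1);
}

exists (x=2 /\ y=2 /\ u=2 /\ v=2)
```

Saved under some name such as rcu-only.litmus (also made up), it should
be possible to run it from tools/memory-model with
"herd7 -conf linux-kernel.cfg rcu-only.litmus". The cycle contains two
grace periods and two read-side critical sections, which fits the usual
rule of thumb that a cycle with at least as many grace periods as
critical sections is forbidden.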

Alan