linux-kernel - Re: [PATCH] tools/memory-model: Make ppo a subrelation of po

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y8hN7vs/w8LhMasT@rowland.harvard.edu>
Date:   Wed, 18 Jan 2023 14:52:14 -0500
From:   Alan Stern <stern@...land.harvard.edu>
To:     Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
Cc:     paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
        peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
        dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
        akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
        urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] tools/memory-model: Make ppo a subrelation of po

On Tue, Jan 17, 2023 at 08:31:59PM +0100, Jonas Oberhauser wrote:
> As stated in the documentation and implied by its name, the ppo
> (preserved program order) relation is intended to link po-earlier
> to po-later instructions under certain conditions.  However, a
> corner case currently allows instructions to be linked by ppo that
> are not executed by the same thread, i.e., instructions are being
> linked that have no po relation.
> 
> This happens due to the mb/strong-fence relations, which (as one
> case) provide order when locks are passed between threads followed
> by an smp_mb__after_unlock_lock() fence.  This is illustrated in
> the following litmus test (as can be seen when using herd7 with
> `doshow ppo`):
> 
> P0(int *x, int *y)
> {
>     spin_lock(x);
>     spin_unlock(x);
> }
> 
> P1(int *x, int *y)
> {
>     spin_lock(x);
>     smp_mb__after_unlock_lock();
>     *y = 1;
> }
> 
> The ppo relation will link P0's spin_lock(x) and P1's *y=1,
> because P0 passes a lock to P1 which then uses this fence.
> 
> The patch makes ppo a subrelation of po by eliminating this
> possibility from mb and strong-fence, and instead introduces the
> notion of strong ordering operations, which are allowed to link
> events of distinct threads.
> 
> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
> ---

I'm just going to comment on the changes to the cat file.  I'll review 
the documentation changes later.

> diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> index 07f884f9b2bf..1d91edbc10fe 100644
> --- a/tools/memory-model/linux-kernel.cat
> +++ b/tools/memory-model/linux-kernel.cat
> @@ -36,9 +36,7 @@ let wmb = [W] ; fencerel(Wmb) ; [W]
>  let mb = ([M] ; fencerel(Mb) ; [M]) |
>  	([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
>  	([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> -	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> -	([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> -		fencerel(After-unlock-lock) ; [M])
> +	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M])

Shouldn't the po case of (co | po) remain intact here?

>  let gp = po ; [Sync-rcu | Sync-srcu] ; po?
>  let strong-fence = mb | gp
>  
> @@ -91,8 +89,12 @@ acyclic hb as happens-before
>  (* Write and fence propagation ordering *)
>  (****************************************)
>  
> -(* Propagation: Each non-rf link needs a strong fence. *)
> -let pb = prop ; strong-fence ; hb* ; [Marked]
> +(* Strong ordering operations *)
> +let strong-order = strong-fence | ([M] ; po-unlock-lock-po ;
> +		[After-unlock-lock] ; po ; [M])

This is not the same as the definition removed above.  In particular, 
po-unlock-lock-po only allows one step along the locking chain -- it has 
rf where the definition above has co.

> +
> +(* Propagation: Each non-rf link needs a strong ordering operation. *)
> +let pb = prop ; strong-order ; hb* ; [Marked]
>  acyclic pb as propagation
>  
>  (*******)
> @@ -141,7 +143,7 @@ let rec rcu-order = rcu-gp | srcu-gp |
>  	(rcu-order ; rcu-link ; rcu-order)
>  let rcu-fence = po ; rcu-order ; po?
>  let fence = fence | rcu-fence

It would be nice here to have a separate term for a potentially 
cross-CPU fence.

In fact, why don't we make a concerted effort to straighten out the 
terminology more fully?  Now seems like a good time to do it.

To begin with, let's be more careful about the difference between an 
order-inducing object (an event or pair of events) and the relation of 
being ordered by such an object.  For instance, given:

	A: WRITE_ONCE(x, 1);
	B: smp_mb();
	C: r1 = READ_ONCE(y);

then B is an order-inducing object (a memory barrier), and (A,C) is a 
pair of events ordered by that object.  In general, an order is related 
to an order-inducing object by:

	order = po ; [order-inducing object] ; po

with suitable modifications for things like smp_store_release where 
one of the events being ordered _is_ the order-inducing event.

So for example, we could consistently refer to all order-inducing events 
as either barriers or fences, and all order-reflecting relations as 
orders.  This would require widespread changes to the .cat file, but I 
think it would be worthwhile.

(Treating "barrier" and "fence" as synonyms seems to be too deeply 
entrenched to try and fight against.)

Once that is straightened out, we can distinguish between fences or 
orders that are weak vs. strong.  And then we can divide up strong 
fences/orders into single-CPU vs. cross-CPU, if we want to.

How does that sound?

Alan

> -let strong-fence = strong-fence | rcu-fence
> +let strong-order = strong-order | rcu-fence
>  
>  (* rb orders instructions just as pb does *)
>  let rb = prop ; rcu-fence ; hb* ; pb* ; [Marked]
> @@ -169,7 +171,7 @@ flag ~empty mixed-accesses as mixed-accesses
>  (* Executes-before and visibility *)
>  let xbstar = (hb | pb | rb)*
>  let vis = cumul-fence* ; rfe? ; [Marked] ;
> -	((strong-fence ; [Marked] ; xbstar) | (xbstar & int))
> +	((strong-order ; [Marked] ; xbstar) | (xbstar & int))
>  
>  (* Boundaries for lifetimes of plain accesses *)
>  let w-pre-bounded = [Marked] ; (addr | fence)?
> @@ -180,9 +182,9 @@ let r-post-bounded = (nonrw-fence | ([~Noreturn] ; fencerel(Rmb) ; [R4rmb]))? ;
>  	[Marked]
>  
>  (* Visibility and executes-before for plain accesses *)
> -let ww-vis = fence | (strong-fence ; xbstar ; w-pre-bounded) |
> +let ww-vis = fence | (strong-order ; xbstar ; w-pre-bounded) |
>  	(w-post-bounded ; vis ; w-pre-bounded)
> -let wr-vis = fence | (strong-fence ; xbstar ; r-pre-bounded) |
> +let wr-vis = fence | (strong-order ; xbstar ; r-pre-bounded) |
>  	(w-post-bounded ; vis ; r-pre-bounded)
>  let rw-xbstar = fence | (r-post-bounded ; xbstar ; w-pre-bounded)
>  
> -- 
> 2.17.1
>