[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y7WdKog5K/UV4JhZ@andrea>
Date:   Wed, 4 Jan 2023 16:37:14 +0100
From:   Andrea Parri <parri.andrea@...il.com>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     Jonas Oberhauser <jonas.oberhauser@...wei.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>, will <will@...nel.org>,
        "boqun.feng" <boqun.feng@...il.com>, npiggin <npiggin@...il.com>,
        dhowells <dhowells@...hat.com>,
        "j.alglave" <j.alglave@....ac.uk>,
        "luc.maranget" <luc.maranget@...ia.fr>, akiyks <akiyks@...il.com>,
        dlustig <dlustig@...dia.com>, joel <joel@...lfernandes.org>,
        urezki <urezki@...il.com>,
        quic_neeraju <quic_neeraju@...cinc.com>,
        frederic <frederic@...nel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus
 test)
On Tue, Jan 03, 2023 at 01:56:08PM -0500, Alan Stern wrote:
> [Added LKML to the CC: list so there will be a permanent record of this 
> part of the discussion, and changed the Subject: to something more 
> descriptive of the topic at hand.]
> 
> On Tue, Jan 03, 2023 at 04:57:56PM +0000, Jonas Oberhauser wrote:
> > Happy new year everyone!
> > 
> > I'd like to circle back to the brief discussion we had about ppo \subseteq po.
> > 
> > Here's some context:
> > 
> > > > > > the preserved program order not always being a 
> > > > > > program order, lack of
> > > > 
> > > > > Where does the LKMM allow a ppo relation not to be in program order?
> > > >
> > > > When one thread releases a lock and another one takes the lock, you 
> > > > can get an mb relation between the two threads
> > > >
> > > > https://github.com/torvalds/linux/blob/master/tools/memory-model/linux
> > > > -kernel.cat#L40
> > > > 
> > > > this then turns into a ppo edge.
> > 
> > > Ah.  I suppose we should have been a little more careful about internal vs. external full barriers.  RCU barriers are also external, but the model didn't try to include them in the definition of mb; we should have done the same with unlock-lock.
> > 
> > To be more explicit, in the current LKMM, mb includes some cases of po;[UL];co;[LKW];po which also relates events between threads, and this trickles up to the ppo:
> > 
> >   let mb = ([M] ; fencerel(Mb) ; [M]) |
> > 	([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > 	([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > 	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > 	([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > 		fencerel(After-unlock-lock) ; [M])
> >   let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> >   let strong-fence = mb | gp
> >   ...
> >   let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int) // expanded for readability
> > 
> > Because of this, not every preserved program order edge is actually a program order edge that is being preserved.
> 
> Indeed, one can argue that neither the fence nor the (po-unlock-lock-po 
> & int) sub-relations should be included in ppo, since they don't reflect 
> dataflow constraints.  They could instead be added separately to the 
> definition of hb, which is the only place that uses ppo.
> 
> > My suggestion for a fix would be to move this part out of mb and strong-fence, and instead introduce a new relation strong-sync that covers synchronization also between threads.
> > 
> >   let mb = ([M] ; fencerel(Mb) ; [M]) |
> > 	([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > 	([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > 	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > - 	([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > -		fencerel(After-unlock-lock) ; [M])
> >   let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> >   let strong-fence = mb | gp
> > +  let strong-sync = strong-fence | ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > +		fencerel(After-unlock-lock) ; [M])
> > ...
> >   let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int)
> > 
> > and then use strong-sync instead of strong-fence everywhere else, e.g.
> > - let pb = prop ; strong-fence ; hb* ; [Marked]
> > + let pb = prop ; strong-sync ; hb* ; [Marked]
> > and similarly where strong-fence is being redefined and used in various later lines.
> > (In general I would prefer renaming also other *-fence relations into *-sync when they include edges between threads).
> > 
> > 
> > Note that no ordering is changed by this move.
> > Firstly, the case [M];po;[UL];po;[LKW]; fencerel(After-unlock-lock) ; [M] which is also eliminated from mb by this change  is still present in ppo through the definition ppo = ... | (po-unlock-lock-po & int).
> > Secondly, for the ordering of [M];po;[UL];co;[LKW]; fencerel(After-unlock-lock) ; [M] we can focus on the case [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] because the other case (coi) is covered by the previous case.
> > Ordering imposed by this case is also not lost, since every [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] edge also imposes a
> >     [M];po;[UL];rfe;[LKR]; fencerel(After-unlock-lock) ; [M]
> > edge which is a po-rel ; [Marked] ; rfe ; [Marked] ; acq-po edge and hence hb;hb;hb.
> > Thirdly, no new ordering is imposed by this change since every place we now order by strong-sync was previously ordered by the old strong-fence which is identical to the new strong-sync, and in all other places we changed we just (potentially) removed ordering.
> > 
> > The definition of strong-sync could also be slightly simplified to 
> >   let strong-sync = strong-fence | ([M]; po-unlock-lock-po ; [After-unlock-lock] ; po ; [M])
> > which is kind of pretty because the after-unlock-lock is now after po-unlock-lock-po.
> > 
> > What do you think?
> 
> That all sounds good to me.  However, I wonder if it might be better to 
> use "strong-order" (and similar) for the new relation name instead of 
> "strong-sync".  The idea being that fences are about ordering, not (or 
> not directly) about synchronization.
Sounds good to me too.  I'm trying to remember why we went for the LKW
event to model smp_mb__after_unlock_lock() (as opposed to the LKR event,
as suggested above/in po-unlock-lock-po).  Anyway, I currently see no 
issue with the above (we know that LKW and LKR come paired), and I think
it's good to merge the two notions of "unlock-lock pair" if possible.
  Andrea
Powered by blists - more mailing lists
 
