lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 26 Jun 2024 08:32:45 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
	rostedt@...dmis.org
Subject: Re: [PATCH rcu 1/6] rcu: Remove full ordering on second EQS snapshot

On Wed, Jun 26, 2024 at 05:03:02PM +0200, Frederic Weisbecker wrote:
> Le Wed, Jun 05, 2024 at 11:44:42AM -0700, Paul E. McKenney a écrit :
> > On Wed, Jun 05, 2024 at 02:21:13PM +0200, Frederic Weisbecker wrote:
> > > Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> > > > From: Frederic Weisbecker <frederic@...nel.org>
> > > > 
> > > > When the grace period kthread checks the extended quiescent state
> > > > counter of a CPU, full ordering is necessary to ensure that either:
> > > > 
> > > > * If the GP kthread observes the remote target in an extended quiescent
> > > >   state, then that target must observe all accesses prior to the current
> > > >   grace period, including the current grace period sequence number, once
> > > >   it exits that extended quiescent state. Also the GP kthread must
> > > >   observe all accesses performed by the target prior it entering in
> > > >   EQS.
> > > > 
> > > > or:
> > > > 
> > > > * If the GP kthread observes the remote target NOT in an extended
> > > >   quiescent state, then the target further entering in an extended
> > > >   quiescent state must observe all accesses prior to the current
> > > >   grace period, including the current grace period sequence number, once
> > > >   it enters that extended quiescent state. Also the GP kthread later
> > > >   observing that EQS must also observe all accesses performed by the
> > > >   target prior it entering in EQS.
> > > > 
> > > > This ordering is explicitly performed both on the first EQS snapshot
> > > > and on the second one as well through the combination of a preceding
> > > > full barrier followed by an acquire read. However the second snapshot's
> > > > full memory barrier is redundant and not needed to enforce the above
> > > > guarantees:
> > > > 
> > > >     GP kthread                  Remote target
> > > >     ----                        -----
> > > >     // Access prior GP
> > > >     WRITE_ONCE(A, 1)
> > > >     // first snapshot
> > > >     smp_mb()
> > > >     x = smp_load_acquire(EQS)
> > > >                                // Access prior GP
> > > >                                WRITE_ONCE(B, 1)
> > > >                                // EQS enter
> > > >                                // implied full barrier by atomic_add_return()
> > > >                                atomic_add_return(RCU_DYNTICKS_IDX, EQS)
> > > >                                // implied full barrier by atomic_add_return()
> > > >                                READ_ONCE(A)
> > > >     // second snapshot
> > > >     y = smp_load_acquire(EQS)
> > > >     z = READ_ONCE(B)
> > > > 
> > > > If the GP kthread above fails to observe the remote target in EQS
> > > > (x not in EQS), the remote target will observe A == 1 after further
> > > > entering in EQS. Then the second snapshot taken by the GP kthread only
> > > > need to be an acquire read in order to observe z == 1.
> > > > 
> > > > Therefore remove the needless full memory barrier on second snapshot.
> > > > 
> > > > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > > > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > > > ---
> > > >  kernel/rcu/tree.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 28c7031711a3f..f07b8bff4621b 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -321,7 +321,7 @@ static bool rcu_dynticks_in_eqs(int snap)
> > > >   */
> > > >  static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
> > > >  {
> > > > -	return snap != rcu_dynticks_snap(rdp->cpu);
> > > > +	return snap != ct_dynticks_cpu_acquire(rdp->cpu);
> > > 
> > > I guess I'm going to add a comment here to elaborate on the fact
> > > it relies on the ordering enforced before the first snapshot. Would
> > > you prefer a delta patch or an updated patch?
> > 
> > Either works, just tell me which you are doing when you submit the patch.
> > Either way, I will arrange for there to be a single combined commit.
> 
> Ok before I resend, how does the following comment look like?
> 
> /*
>  * The first failing snapshot is already ordered against the accesses
>  * performed by the remote CPU after it exiting idle.

s/exiting/exits/

>  * The second snapshot therefore only needs to order against accesses
>  * performed by the remote CPU prior it entering idle and therefore can
>  * solely on acquire semantics.
>  */

s/prior it entering/prior to entering/
s/solely/rely solely/

Other than those nits, looks good to me!  

								Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ