linux-kernel - Re: [PATCH rcu 2/6] rcu: Remove superfluous full memory barrier upon first EQS snapshot

Message-ID: <ZnwiCsor-cku3ETF@localhost.localdomain>
Date: Wed, 26 Jun 2024 16:13:30 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Neeraj upadhyay <neeraj.iitr10@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel-team@...a.com,
	rostedt@...dmis.org
Subject: Re: [PATCH rcu 2/6] rcu: Remove superfluous full memory barrier upon
 first EQS snapshot

On Wed, Jun 12, 2024 at 01:57:20PM +0530, Neeraj upadhyay wrote:
> On Wed, Jun 5, 2024 at 3:58 AM Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > From: Frederic Weisbecker <frederic@...nel.org>
> >
> > When the grace period kthread checks the extended quiescent state
> > counter of a CPU, full ordering is necessary to ensure that either:
> >
> > * If the GP kthread observes the remote target in an extended quiescent
> >   state, then that target must observe all accesses prior to the current
> >   grace period, including the current grace period sequence number, once
> >   it exits that extended quiescent state.
> >
> > or:
> >
> > * If the GP kthread observes the remote target NOT in an extended
> >   quiescent state, then the target further entering in an extended
> >   quiescent state must observe all accesses prior to the current
> >   grace period, including the current grace period sequence number, once
> >   it enters that extended quiescent state.
> >
> > This ordering is enforced through a full memory barrier placed right
> > before taking the first EQS snapshot. However this is superfluous
> > because the snapshot is taken while holding the target's rnp lock which
> > provides the necessary ordering through its chain of
> > smp_mb__after_unlock_lock().
> >
> > Remove the needless explicit barrier before the snapshot and put a
> > comment about the implicit barrier newly relied upon here.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > ---
> >  .../Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst    | 6 +++---
> >  kernel/rcu/tree.c                                          | 7 ++++++-
> >  2 files changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > index 5750f125361b0..728b1e690c646 100644
> > --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > @@ -149,9 +149,9 @@ This case is handled by calls to the strongly ordered
> >  ``atomic_add_return()`` read-modify-write atomic operation that
> >  is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
> >  time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
> > -The grace-period kthread invokes ``rcu_dynticks_snap()`` and
> > -``rcu_dynticks_in_eqs_since()`` (both of which invoke
> > -an ``atomic_add_return()`` of zero) to detect idle CPUs.
> > +The grace-period kthread invokes first ``ct_dynticks_cpu_acquire()``
> > +(preceded by a full memory barrier) and ``rcu_dynticks_in_eqs_since()``
> > +(both of which rely on acquire semantics) to detect idle CPUs.
> >
> >  +-----------------------------------------------------------------------+
> >  | **Quick Quiz**:                                                       |
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index f07b8bff4621b..1a6ef9c5c949e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -769,7 +769,12 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
> >   */
> >  static int dyntick_save_progress_counter(struct rcu_data *rdp)
> >  {
> > -       rdp->dynticks_snap = rcu_dynticks_snap(rdp->cpu);
> > +       /*
> > +        * Full ordering against accesses prior current GP and also against
> > +        * current GP sequence number is enforced by current rnp locking
> > +        * with chained smp_mb__after_unlock_lock().
> > +        */
> 
> It might be worth mentioning that this chained smp_mb__after_unlock_lock()
> is provided by rnp leaf node locking in rcu_gp_init() and rcu_gp_fqs_loop() ?

Right!

How about this?

    /*
     * Full ordering against accesses prior to the current GP and also
     * against the current GP sequence number is enforced by the implicit
     * barrier in rcu_seq_start() and even further by the
     * smp_mb__after_unlock_lock() barriers chained all the way through
     * the rnp locking tree, from rcu_gp_init() up to the current leaf
     * rnp locking.
     */
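
To make the reasoning easier to follow, here is a rough user-space sketch
of the pattern in C11 atomics. It is only a model, not kernel code: the
names gp_start(), snapshot_dynticks(), eqs_enter(), eqs_exit(), in_eqs(),
in_eqs_since() and EQS_IDX are hypothetical stand-ins, and a seq_cst fence
is assumed to play the role of the rcu_seq_start() implicit barrier and of
the chained smp_mb__after_unlock_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: low bit set means the CPU is NOT in an EQS. */
#define EQS_IDX 1

static atomic_int dynticks = EQS_IDX;	/* target CPU starts out non-idle */
static atomic_int gp_seq;		/* grace period sequence number */

/* Target CPU side: fully ordered RMW on EQS entry and exit. */
static void eqs_enter(void) { atomic_fetch_add(&dynticks, EQS_IDX); }
static void eqs_exit(void)  { atomic_fetch_add(&dynticks, EQS_IDX); }

/*
 * GP kthread side: publish the new sequence number, then a full barrier.
 * The fence is an assumption standing in for the rcu_seq_start() implicit
 * barrier and the later smp_mb__after_unlock_lock() chain.
 */
static void gp_start(void)
{
	atomic_fetch_add_explicit(&gp_seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * First EQS snapshot. No explicit full barrier here: the model relies on
 * the fence already issued in gp_start(), so an acquire load is enough.
 */
static int snapshot_dynticks(void)
{
	return atomic_load_explicit(&dynticks, memory_order_acquire);
}

/* Was the target in an EQS at snapshot time? */
static bool in_eqs(int snap)
{
	return !(snap & EQS_IDX);
}

/* Has the target passed through an EQS since the snapshot? */
static bool in_eqs_since(int snap)
{
	return snap != atomic_load_explicit(&dynticks, memory_order_acquire);
}
```

A caller would invoke gp_start(), take a snapshot, and later report a
quiescent state for the target if either in_eqs(snap) or in_eqs_since(snap)
is true. The point of the sketch is only where the full barrier sits: once
on the GP side before the snapshot, and in the RMW on the target side.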

Thanks.