Date: Thu, 16 May 2024 17:31:40 +0200
From: Valentin Schneider <vschneid@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>, LKML
 <linux-kernel@...r.kernel.org>
Cc: Frederic Weisbecker <frederic@...nel.org>, "Paul E. McKenney"
 <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>, Joel Fernandes
 <joel@...lfernandes.org>, Neeraj Upadhyay <neeraj.upadhyay@....com>,
 Uladzislau Rezki <urezki@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
 rcu <rcu@...r.kernel.org>
Subject: Re: [PATCH 2/6] rcu: Remove superfluous full memory barrier upon
 first EQS snapshot

On 15/05/24 14:53, Frederic Weisbecker wrote:
> When the grace period kthread checks the extended quiescent state
> counter of a CPU, full ordering is necessary to ensure that either:
>
> * If the GP kthread observes the remote target in an extended quiescent
>   state, then that target must observe all accesses prior to the current
>   grace period, including the current grace period sequence number, once
>   it exits that extended quiescent state.
>
> or:
>
> * If the GP kthread observes the remote target NOT in an extended
>   quiescent state, then the target further entering in an extended
>   quiescent state must observe all accesses prior to the current
>   grace period, including the current grace period sequence number, once
>   it enters that extended quiescent state.
>
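
The "either/or" above is the classic store-buffering shape, so it can be
checked against tools/memory-model. A minimal litmus sketch (gp_seq and
dynticks are hypothetical stand-ins for the GP sequence number and the
EQS counter, smp_mb() for whatever provides the full ordering):

  C gp-vs-eqs-sb

  {}

  P0(int *gp_seq, int *dynticks)
  {
          int r0;

          WRITE_ONCE(*gp_seq, 1);
          smp_mb();
          r0 = READ_ONCE(*dynticks); /* EQS snapshot */
  }

  P1(int *gp_seq, int *dynticks)
  {
          int r1;

          WRITE_ONCE(*dynticks, 1); /* EQS entry/exit */
          smp_mb();
          r1 = READ_ONCE(*gp_seq);
  }

  exists (0:r0=0 /\ 1:r1=0)

herd7 reports the "exists" clause as never satisfied: with full barriers
on both sides, at least one of the two reads must see the other side's
store, which is exactly the two-case guarantee described above.
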
> This ordering is enforced through a full memory barrier placed right
> before taking the first EQS snapshot. However this is superfluous
> because the snapshot is taken while holding the target's rnp lock which
> provides the necessary ordering through its chain of
> smp_mb__after_unlock_lock().
>
> Remove the needless explicit barrier before the snapshot and put a
> comment about the implicit barrier newly relied upon here.
>
> Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> ---
>  .../Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst    | 6 +++---
>  kernel/rcu/tree.c                                          | 7 ++++++-
>  2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> index 5750f125361b..728b1e690c64 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> @@ -149,9 +149,9 @@ This case is handled by calls to the strongly ordered
>  ``atomic_add_return()`` read-modify-write atomic operation that
>  is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
>  time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
> -The grace-period kthread invokes ``rcu_dynticks_snap()`` and
> -``rcu_dynticks_in_eqs_since()`` (both of which invoke
> -an ``atomic_add_return()`` of zero) to detect idle CPUs.
> +The grace-period kthread invokes first ``ct_dynticks_cpu_acquire()``
> +(preceded by a full memory barrier) and ``rcu_dynticks_in_eqs_since()``
> +(both of which rely on acquire semantics) to detect idle CPUs.
>
>  +-----------------------------------------------------------------------+
>  | **Quick Quiz**:                                                       |
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 58415cdc54f8..f5354de5644b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -773,7 +773,12 @@ static void rcu_gpnum_ovf(struct rcu_node *rnp, struct rcu_data *rdp)
>   */
>  static int dyntick_save_progress_counter(struct rcu_data *rdp)
>  {
> -	rdp->dynticks_snap = rcu_dynticks_snap(rdp->cpu);

So for PPC, which gets the smp_mb() at the lock acquisition via
smp_mb__after_unlock_lock(), this is an "obviously" redundant smp_mb().

For the other archs, per the definition of smp_mb__after_unlock_lock(), it
seems implied that UNLOCK+LOCK is a full memory barrier, but I wanted to
see it explicitly stated somewhere. From a bit of spelunking below I still
think it's the case, but is there a "better" source of truth?

  01352fb81658 ("locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier")
  """
  The Linux kernel has traditionally required that an UNLOCK+LOCK pair act as a
  full memory barrier when either (1) that UNLOCK+LOCK pair was executed by the
  same CPU or task, or (2) the same lock variable was used for the UNLOCK and
  LOCK.
  """

and

  https://lore.kernel.org/all/1436789704-10086-1-git-send-email-will.deacon@arm.com/
  """
  This ordering guarantee is already provided without the barrier on
  all architectures apart from PowerPC
  """

> +	/*
> +	 * Full ordering against accesses prior current GP and also against
                                          ^^^^^
                                          prior to

> +	 * current GP sequence number is enforced by current rnp locking
> +	 * with chained smp_mb__after_unlock_lock().
> +	 */
> +	rdp->dynticks_snap = ct_dynticks_cpu_acquire(rdp->cpu);
>       if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
>               trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
>               rcu_gpnum_ovf(rdp->mynode, rdp);
> --
> 2.44.0
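
One last sanity check on the new comment: it is the rnp locking, not the
acquire in ct_dynticks_cpu_acquire(), that provides the ordering against
accesses prior to the current GP. Dropping the lock chain from the sketch
above and keeping only an acquire load leaves the store-buffering outcome
allowed:

  C gp-vs-eqs-acquire-only

  {}

  P0(int *gp_seq, int *dynticks)
  {
          int r0;

          WRITE_ONCE(*gp_seq, 1);
          r0 = smp_load_acquire(dynticks); /* acquire only, no full barrier */
  }

  P1(int *gp_seq, int *dynticks)
  {
          int r1;

          WRITE_ONCE(*dynticks, 1);
          smp_mb();
          r1 = READ_ONCE(*gp_seq);
  }

  exists (0:r0=0 /\ 1:r1=0)

herd7 flags this one as allowed: the acquire only orders the snapshot
against subsequent accesses, so the full ordering really does have to come
from the chained smp_mb__after_unlock_lock().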

