linux-kernel - Re: RCU vs NOHZ

Message-ID: <YzV8M4OU2wj7L9W+@hirez.programming.kicks-ass.net>
Date:   Thu, 29 Sep 2022 13:06:27 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Joel Fernandes <joel@...lfernandes.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: RCU vs NOHZ

On Wed, Sep 21, 2022 at 02:36:44PM -0700, Paul E. McKenney wrote:

> commit 80fc02e80a2dfb6c7468217cff2d4494a1c4b58d
> Author: Paul E. McKenney <paulmck@...nel.org>
> Date:   Wed Sep 21 13:30:24 2022 -0700
> 
>     rcu: Let non-offloaded idle CPUs with callbacks defer tick
>     
>     When a CPU goes idle, rcu_needs_cpu() is invoked to determine whether or
>     not RCU needs the scheduler-clock tick to keep interrupting.  Right now,
>     RCU keeps the tick on for a given idle CPU if there are any non-offloaded
>     callbacks queued on that CPU.
>     
>     But if all of these callbacks are waiting for a grace period to finish,
>     there is no point in scheduling a tick before that grace period has any
>     reasonable chance of completing.  This commit therefore delays the tick
>     in the case where all the callbacks are waiting for a specific grace
>     period to elapse.  In theory, this should result in a 50-70% reduction in
>     RCU-induced scheduling-clock ticks on mostly-idle CPUs.  In practice, TBD.
>     
>     Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
>     Cc: Peter Zijlstra <peterz@...radead.org>

> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5ec97e3f7468..47cd3b0d2a07 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -676,12 +676,33 @@ void __rcu_irq_enter_check_tick(void)
>   * scheduler-clock interrupt.
>   *
>   * Just check whether or not this CPU has non-offloaded RCU callbacks
> - * queued.
> + * queued that need immediate attention.
>   */
> -int rcu_needs_cpu(void)
> +int rcu_needs_cpu(u64 basemono, u64 *nextevt)
>  {
> -	return !rcu_segcblist_empty(&this_cpu_ptr(&rcu_data)->cblist) &&
> -		!rcu_rdp_is_offloaded(this_cpu_ptr(&rcu_data));
> +	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> +	struct rcu_segcblist *rsclp = &rdp->cblist;
> +
> +	// Disabled, empty, or offloaded means nothing to do.
> +	if (!rcu_segcblist_is_enabled(rsclp) ||
> +	    rcu_segcblist_empty(rsclp) || rcu_rdp_is_offloaded(rdp)) {
> +		*nextevt = KTIME_MAX;
> +		return 0;
> +	}

So far agreed; however, I was arguing to instead:

> +
> +	// Callbacks ready to invoke or that have not already been
> +	// assigned a grace period need immediate attention.
> +	if (!rcu_segcblist_segempty(rsclp, RCU_DONE_TAIL) ||
> +	    !rcu_segcblist_segempty(rsclp, RCU_NEXT_TAIL))
> +		return 1;
> +
> +	// There are callbacks waiting for some later grace period.
> +	// Wait for about a grace period or two for the next tick, at which
> +	// point there is high probability that this CPU will need to do some
> +	// work for RCU.
> +	*nextevt = basemono + TICK_NSEC * (READ_ONCE(jiffies_till_first_fqs) +
> +					   READ_ONCE(jiffies_till_next_fqs) + 1);
> +	return 0;
>  }

force offload whatever you have in this case and always have it return
false.

Except I don't think this is quite the right place; there's too much
that can still get in the way of stopping the tick. I would delay the
force offload to the place where we actually know we're going to stop
the tick.