Message-ID: <aU24p7he1T63Qeke@pavilion.home>
Date: Thu, 25 Dec 2025 23:20:23 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: linux-kernel@...r.kernel.org, "Paul E. McKenney" <paulmck@...nel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Uladzislau Rezki <urezki@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang@...ux.dev>, rcu@...r.kernel.org
Subject: Re: [PATCH RFC] rcu/nocb: Remove unnecessary WakeOvfIsDeferred wake path

On Thu, Dec 25, 2025 at 02:44:50AM -0500, Joel Fernandes wrote:
> The WakeOvfIsDeferred code path in __call_rcu_nocb_wake() attempts to
> wake rcuog when the callback count exceeds qhimark and callbacks aren't
> done with their GP (newly queued or awaiting GP). However, a lot of
> testing proves this wake is always redundant or useless.
> 
> In the flooding case, rcuog is always waiting for a GP to finish. So
> waking up the rcuog thread is pointless. The timer wakeup adds overhead,
> rcuog simply wakes up and goes back to sleep achieving nothing.
> 
> This path also adds a full memory barrier, and additional timer expiry
> modifications unnecessarily.
> 
> The root cause is that WakeOvfIsDeferred fires when
> !rcu_segcblist_ready_cbs() (GP not complete), but waking rcuog cannot
> accelerate GP completion.
> 
> This commit therefore removes this path, while also adding some rdp
> counters to ensure we don't have lost wake ups.

There should be two patches: one that removes the useless path and the
other that adds the debugging.

> 
> Tested with rcutorture scenarios: TREE01, TREE05, TREE08 (all NOCB
> configurations) - all pass. Also stress tested using a kernel module
> that floods call_rcu() to trigger the overload conditions and made the
> observations confirming the findings.
> 
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>

Cool! Just a few comments:

> @@ -549,24 +546,26 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  	lazy_len = READ_ONCE(rdp->lazy_len);
>  	if (was_alldone) {
>  		rdp->qlen_last_fqs_check = len;
> +		rdp->nocb_gp_wake_attempt = true;
> +		rcu_nocb_unlock(rdp);
>  		// Only lazy CBs in bypass list
>  		if (lazy_len && bypass_len == lazy_len) {
> -			rcu_nocb_unlock(rdp);
>  			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY,
>  					   TPS("WakeLazy"));
>  		} else if (!irqs_disabled_flags(flags)) {
>  			/* ... if queue was empty ... */
> -			rcu_nocb_unlock(rdp);
>  			wake_nocb_gp(rdp, false);
>  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>  					    TPS("WakeEmpty"));
>  		} else {
> -			rcu_nocb_unlock(rdp);
>  			wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE,
>  					   TPS("WakeEmptyIsDeferred"));
>  		}
> +
> +		return;
>  	} else if (len > rdp->qlen_last_fqs_check + qhimark) {
> -		/* ... or if many callbacks queued. */
> +		/* Callback overload condition. */
> +		WARN_ON_ONCE(!rdp->nocb_gp_wake_attempt && !rdp->nocb_gp_serving);

With this test, the point of ->nocb_gp_serving is unclear: both flags are
cleared in the same place, but ->nocb_gp_serving is only set later by the
gp kthread. So ->nocb_gp_serving implies ->nocb_gp_wake_attempt, and the
above test is equivalent to WARN_ON_ONCE(!rdp->nocb_gp_wake_attempt).

In fact ->nocb_gp_wake_attempt alone probably makes sense?
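
To make the redundancy concrete, here is a small userspace sketch. It is
not kernel code: struct nocb_flags and all the helper names are made up
for illustration, and the "serving implies wake_attempt" invariant is an
assumption taken from the reasoning above. It enumerates every flag
combination and counts the reachable ones where the two-flag condition
and the one-flag condition disagree.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two debugging flags in the patch. */
struct nocb_flags {
	bool wake_attempt;	/* models ->nocb_gp_wake_attempt */
	bool serving;		/* models ->nocb_gp_serving */
};

/* Assumed invariant: serving can only be set after a wake attempt. */
static bool state_is_reachable(struct nocb_flags f)
{
	return !f.serving || f.wake_attempt;
}

/* The condition as written in the patch. */
static bool warn_two_flags(struct nocb_flags f)
{
	return !f.wake_attempt && !f.serving;
}

/* The simplified condition using only the wake-attempt flag. */
static bool warn_one_flag(struct nocb_flags f)
{
	return !f.wake_attempt;
}

/* Count reachable states where the two conditions disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int a = 0; a <= 1; a++) {
		for (int s = 0; s <= 1; s++) {
			struct nocb_flags f = { .wake_attempt = a, .serving = s };

			if (!state_is_reachable(f))
				continue;
			if (warn_two_flags(f) != warn_one_flag(f))
				mismatches++;
		}
	}
	return mismatches;
}
```

The only state where the two conditions differ is serving set without a
wake attempt, which the assumed invariant rules out.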

>  		rdp->qlen_last_fqs_check = len;
>  		j = jiffies;
>  		if (j != rdp->nocb_gp_adv_time &&
> @@ -575,21 +574,10 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  			rcu_advance_cbs_nowake(rdp->mynode, rdp);
>  			rdp->nocb_gp_adv_time = j;
>  		}
> -		smp_mb(); /* Enqueue before timer_pending(). */

You also need to remove the paired smp_mb__after_spinlock() in
do_nocb_deferred_wakeup_timer().
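
As a rough userspace analogy of why the two barriers go together, here is
a C11 sketch. It assumes the pair forms the usual store-buffering
pattern, which is my reading of the two comments and not something the
patch states. queued_cbs, timer_armed and the function names are all
hypothetical, and the seq_cst fences only stand in for the kernel
barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel state. */
static atomic_int queued_cbs;	/* models the number of enqueued callbacks */
static atomic_bool timer_armed;	/* models a pending deferred-wake timer */

/* Put the model back into its initial state: nothing queued, timer armed. */
static void reset_model(void)
{
	atomic_store(&queued_cbs, 0);
	atomic_store(&timer_armed, true);
}

/* Enqueue side: publish the callback, then check the timer. */
static bool enqueue_sees_timer_armed(void)
{
	atomic_fetch_add_explicit(&queued_cbs, 1, memory_order_relaxed);
	/* Full fence, standing in for the removed smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&timer_armed, memory_order_relaxed);
}

/* Timer side: mark the timer as expired, then look at the queue. */
static int timer_sees_queued_cbs(void)
{
	atomic_store_explicit(&timer_armed, false, memory_order_relaxed);
	/* Full fence, standing in for the barrier in the timer handler. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&queued_cbs, memory_order_relaxed);
}
```

With both fences in place, two threads running these functions
concurrently cannot end up with the enqueuer still seeing the timer armed
while the timer sees an empty queue. Under that assumption, once the
enqueue-side check is gone, the fence on the timer side no longer has
anything to pair with for this purpose.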

Thanks.

-- 
Frederic Weisbecker
SUSE Labs
