Message-ID: <986b4f25-5fbf-45b1-8475-721bc3e95223@paulmck-laptop>
Date: Thu, 22 Jan 2026 13:55:11 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>,
	rcu@...r.kernel.org, Frederic Weisbecker <frederic@...nel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Uladzislau Rezki <urezki@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang@...ux.dev>
Subject: Re: [PATCH -next v3 2/3] rcu/nocb: Remove dead callback overload
 handling

On Mon, Jan 19, 2026 at 06:12:22PM -0500, Joel Fernandes wrote:
> During callback overload (exceeding qhimark), the NOCB code attempts
> opportunistic advancement via rcu_advance_cbs_nowake(). Analysis shows
> this entire code path is dead:
> 
> - 30 overload conditions triggered with 300,000 callback flood
> - 0 advancements actually occurred
> - 100% of time blocked because current GP not done
> 
> The overload condition triggers when callbacks are coming in at a high
> rate with GPs not completing as fast. But the advancement requires the
> GP to be complete - a logical contradiction. Even if the GP did complete
> in time, nocb_gp_wait() has to wake up anyway to do the advancement, so
> it is pointless.
> 
> Since the advancement is dead code, the entire overload handling block
> serves no purpose. Remove it entirely.
> 
> Suggested-by: Frederic Weisbecker <frederic@...nel.org>
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> ---
>  kernel/rcu/tree_nocb.h | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index f525e4f7985b..64a8ff350f92 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -526,8 +526,6 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  				 __releases(rdp->nocb_lock)
>  {
>  	long bypass_len;
> -	unsigned long cur_gp_seq;
> -	unsigned long j;
>  	long lazy_len;
>  	long len;
>  	struct task_struct *t;
> @@ -562,16 +560,6 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  		}
>  
>  		return;
> -	} else if (len > rdp->qlen_last_fqs_check + qhimark) {
> -		/* ... or if many callbacks queued. */
> -		rdp->qlen_last_fqs_check = len;
> -		j = jiffies;
> -		if (j != rdp->nocb_gp_adv_time &&
> -		    rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&

This places in cur_gp_seq not the grace period for the most recently
queued callback (which would be unlikely to have finished), but rather
the grace period for the oldest callback that has not yet been marked as
done.  And that callback was queued some time ago, and thus its grace
period might well have finished.

So while this code might not have been executed in your tests, it is
definitely not a logical contradiction.
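
To make the distinction concrete, here is a minimal userspace sketch
of the check in question.  It is a simplified model rather than the
kernel code: every model_* and MODEL_* name is hypothetical, the
segment layout only mirrors my understanding of rcu_segcblist, and
locking and memory ordering are ignored entirely.

```c
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a >= b", modeled on the kernel's ULONG_CMP_GE(). */
#define MODEL_ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical segment indices, oldest callbacks first. */
enum { MODEL_DONE, MODEL_WAIT, MODEL_NEXT_READY, MODEL_NEXT, MODEL_NSEGS };

/* Hypothetical stand-in for the segmented callback list. */
struct model_cblist {
	long seglen[MODEL_NSEGS];          /* callbacks queued in each segment */
	unsigned long gp_seq[MODEL_NSEGS]; /* grace period each segment awaits */
};

/*
 * Assumed behavior of rcu_segcblist_nextgp(): if any callbacks are
 * still pending, report the grace period awaited by the WAIT segment,
 * that is, by the oldest callbacks not yet marked done.
 */
static bool model_nextgp(const struct model_cblist *cl, unsigned long *gp)
{
	if (!(cl->seglen[MODEL_WAIT] + cl->seglen[MODEL_NEXT_READY] +
	      cl->seglen[MODEL_NEXT]))
		return false;
	*gp = cl->gp_seq[MODEL_WAIT];
	return true;
}

/* Has the grace-period counter cur reached the requested value s? */
static bool model_seq_done(unsigned long cur, unsigned long s)
{
	return MODEL_ULONG_CMP_GE(cur, s);
}

/* The condition guarding the opportunistic advancement in the patch. */
static bool model_can_advance(const struct model_cblist *cl,
			      unsigned long node_gp_seq)
{
	unsigned long gp;

	return model_nextgp(cl, &gp) && model_seq_done(node_gp_seq, gp);
}
```

In this model, a flood of newly queued callbacks lands in the NEXT
segment, but the check compares the node's counter only against the
WAIT segment's grace period.  So the check can succeed even while the
newest callbacks still have a long wait ahead of them.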

Or am I missing something subtle here?

						Thanx, Paul

> -		    rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
> -			rcu_advance_cbs_nowake(rdp->mynode, rdp);
> -			rdp->nocb_gp_adv_time = j;
> -		}
>  	}
>  
>  	rcu_nocb_unlock(rdp);
> -- 
> 2.34.1
> 
