linux-kernel - Re: [PATCH -next 5/8] rcu/nocb: Add warning to detect if overload advancement is ever useful

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <654e3bde-764b-49ae-8faa-bab8199b1f15@nvidia.com>
Date: Tue, 13 Jan 2026 20:09:35 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: "Paul E . McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, rcu@...r.kernel.org
Cc: Frederic Weisbecker <frederic@...nel.org>,
 Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
 Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang@...ux.dev>,
 Shuah Khan <shuah@...nel.org>, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH -next 5/8] rcu/nocb: Add warning to detect if overload
 advancement is ever useful

Since I am resubmitting the nocb patches in this series (3 of them from this
series) for the next merge window, I thought I'll replace this particular patch
with just a deletion of the rcu_advance_cbs_nowake() call itself instead of
bloating the code path with warnings and comments.

linux-next and many days of testing on my side are also looking good.

Thoughts?  Once I get any opinions, I'll change this patch to do the deletion.
Also I am adding one other (trivial) patch to this series:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=nocb-7.0&id=84669d678b9cb28ff8774a3b6457186a4a187c75

Running overnight tests on all 4 patches now...

thanks,

 - Joel

On 1/1/2026 11:34 AM, Joel Fernandes wrote:
> During callback overload, the NOCB code attempts an opportunistic
> advancement via rcu_advance_cbs_nowake().
> 
> Analysis via tracing with 300,000 callbacks flooded shows this
> optimization is likely dead code:
> - 30 overload conditions triggered
> - 0 advancements actually occurred
> - 100% of time no advancement due to current GP not done.
> 
> I also ran TREE05 and TREE08 for 2 hours and cannot trigger it.
> 
> When callbacks overflow (exceed qhimark), they are waiting for a grace
> period that hasn't completed yet. The optimization requires the GP to be
> complete to advance callbacks, but the overload condition itself is
> caused by callbacks piling up faster than GPs can complete. This creates
> a logical contradiction where the advancement cannot happen.
> 
> In *theory* this might be possible, the GP completed just in the nick of
> time as we hit the overload, but this is just so rare that it can be
> considered impossible when we cannot even hit it with synthetic callback
> flooding even, it is a waste of cycles to even try to advance, let alone
> be useful and is a maintenance burden complexity we don't need.
> 
> I suggest deletion. However, add a WARN_ON_ONCE for a merge window or 2
> and delete it after out of extreme caution.
> 
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> ---
>  kernel/rcu/tree_nocb.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 7e9d465c8ab1..d3e6a0e77210 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -571,8 +571,20 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
>  		if (j != rdp->nocb_gp_adv_time &&
>  		    rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
>  		    rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
> +			long done_before = rcu_segcblist_get_seglen(&rdp->cblist, RCU_DONE_TAIL);
> +
>  			rcu_advance_cbs_nowake(rdp->mynode, rdp);
>  			rdp->nocb_gp_adv_time = j;
> +
> +			/*
> +			 * The advance_cbs call above is not useful. Under an
> +			 * overload condition, nocb_gp_wait() is always waiting
> +			 * for GP completion, due to this nothing can be moved
> +			 * from WAIT to DONE, in the list. WARN if an
> +			 * advancement happened (next step is deletion of advance).
> +			 */
> +			WARN_ON_ONCE(rcu_segcblist_get_seglen(&rdp->cblist,
> +				     RCU_DONE_TAIL) > done_before);
>  		}
>  	}
>