lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20251227001239.405644-4-joelagnelf@nvidia.com>
Date: Fri, 26 Dec 2025 19:12:39 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: "Paul E . McKenney" <paulmck@...nel.org>,
	Frederic Weisbecker <frederic@...nel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Uladzislau Rezki <urezki@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Zqiang <qiang.zhang@...ux.dev>
Cc: rcu@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Joel Fernandes <joelagnelf@...dia.com>
Subject: [PATCH v2 3/3] rcu/nocb: Add warning to detect if overload advancement is ever useful

During callback overload, the NOCB code attempts an opportunistic
advancement via rcu_advance_cbs_nowake().

Analysis via tracing with 300,000 callbacks flooded shows this
optimization is likely dead code:
- 30 overload conditions triggered
- 0 advancements actually occurred
- 100% of time no advancement due to current GP not done.

I also ran TREE05 and TREE08 for 2 hours and cannot trigger it.

When callbacks overflow (exceed qhimark), they are waiting for a grace
period that hasn't completed yet. The optimization requires the GP to be
complete to advance callbacks, but the overload condition itself is
caused by callbacks piling up faster than GPs can complete. This creates
a logical contradiction where the advancement cannot happen.

In *theory* this might be possible, the GP completed just in the nick of
time as we hit the overload, but this is just so rare that it can be
considered impossible when we cannot even hit it with synthetic callback
flooding even, it is a waste of cycles to even try to advance, let alone
be useful and is a maintenance burden complexity we don't need.

I suggest deletion. However, add a WARN_ON_ONCE for a merge window or 2
and delete it after out of extreme caution.

Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
---
 kernel/rcu/tree_nocb.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 7e9d465c8ab1..d3e6a0e77210 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -571,8 +571,20 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
 		if (j != rdp->nocb_gp_adv_time &&
 		    rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
 		    rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
+			long done_before = rcu_segcblist_get_seglen(&rdp->cblist, RCU_DONE_TAIL);
+
 			rcu_advance_cbs_nowake(rdp->mynode, rdp);
 			rdp->nocb_gp_adv_time = j;
+
+			/*
+			 * The advance_cbs call above is not useful. Under an
+			 * overload condition, nocb_gp_wait() is always waiting
+			 * for GP completion, due to this nothing can be moved
+			 * from WAIT to DONE, in the list. WARN if an
+			 * advancement happened (next step is deletion of advance).
+			 */
+			WARN_ON_ONCE(rcu_segcblist_get_seglen(&rdp->cblist,
+				     RCU_DONE_TAIL) > done_before);
 		}
 	}
 
-- 
2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ