lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <FF0E8ADE-F5BF-44BB-A1A8-723299C1085C@nvidia.com>
Date: Mon, 19 Jan 2026 19:59:33 -0500
From: joelagnelf@...dia.com
To: paulmck@...nel.org
Cc: Frederic Weisbecker <frederic@...nel.org>, linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>, rcu@...r.kernel.org, Neeraj Upadhyay <neeraj.upadhyay@...nel.org>, Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>, Steven Rostedt <rostedt@...dmis.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang@...ux.dev>
Subject: Re: [PATCH -next v3 2/3] rcu/nocb: Remove dead callback overload handling

> On Jan 19, 2026, at 7:07 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
> 
> On Tue, Jan 20, 2026 at 12:53:26AM +0100, Frederic Weisbecker wrote:
>> Le Mon, Jan 19, 2026 at 06:12:22PM -0500, Joel Fernandes a écrit :
>>> During callback overload (exceeding qhimark), the NOCB code attempts
>>> opportunistic advancement via rcu_advance_cbs_nowake(). Analysis shows
>>> this entire code path is dead:
>>> 
>>> - 30 overload conditions triggered with 300,000 callback flood
>>> - 0 advancements actually occurred
>>> - 100% of time blocked because current GP not done
>>> 
>>> The overload condition triggers when callbacks are coming in at a high
>>> rate with GPs not completing as fast. But the advancement requires the
>>> GP to be complete - a logical contradiction. Even if the GP did complete
>>> in time, nocb_gp_wait() has to wake up anyway to do the advancement, so
>>> it is pointless.
>>> 
>>> Since the advancement is dead code, the entire overload handling block
>>> serves no purpose. Remove it entirely.
>>> 
>>> Suggested-by: Frederic Weisbecker <frederic@...nel.org>
>>> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
>> 
>> Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
>> 
>> Would be nice to have Paul's ack as well, in case we missed something subtle
>> here.
> 
> Given that you are good with it, I will take a look.  And test it.  ;-)

Sure, thanks!

>> Also probably for upcoming merge window + 1, note that similar code with
>> similar removal opportunity resides in rcu_nocb_try_bypass().
>> And ->nocb_gp_adv_time could then be removed.
> 
> Further simplification sounds like a good thing!  Just not too simple,
> you understand!  ;-)

Yes I have some more queued in my local tree that I plan for merge window + 1. :-)

By the way, I have another recent idea: why don't we trigger nocb poll mode
automatically under overload condition? Currently rcu_nocb_poll is only set via
the boot parameter and stays constant. Testing shows me that poll mode can cause
GP completion faster during overload, so dynamically enabling it when we exceed
qhimark could be beneficial. The question then is how do we turn it off
dynamically as well - perhaps when callback count drops below qlowmark, and
using some debounce logic to avoid too frequent toggling?

>                            Thanx, Paul

thanks,

 - Joel


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ