lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1b547da-23b7-48c7-8566-56be4ef2cef0@nvidia.com>
Date: Fri, 9 Jan 2026 16:20:25 -0500
From: Joel Fernandes <joelagnelf@...dia.com>
To: Frederic Weisbecker <frederic@...nel.org>,
 Joel Fernandes <joel@...lfernandes.org>
Cc: "Paul E . McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, rcu@...r.kernel.org,
 Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
 Josh Triplett <josh@...htriplett.org>, Uladzislau Rezki <urezki@...il.com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang@...ux.dev>,
 Shuah Khan <shuah@...nel.org>, linux-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH -next 4/8] rcu/nocb: Add warning if no rcuog wake up
 attempt happened during overload



On 1/9/2026 9:36 AM, Frederic Weisbecker wrote:
> Le Thu, Jan 08, 2026 at 10:49:30PM -0500, Joel Fernandes a écrit :
>>>> @@ -688,6 +690,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
>>>>  		     bypass_ncbs > 2 * qhimark)) {
>>>>  			flush_bypass = true;
>>>>  		} else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
>>>> +			rdp->nocb_gp_wake_attempt = false;
>>>
>>> This is when nocb_cb_wait() is done with callbacks but nocb_gp_wait() is done
>>> with them sooner, when the grace period is done for all pending callbacks.
>>>
>>> Something like this would perhaps be more accurate:
>>>
>>> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
>>> index e6cd56603cad..52010cbeaa76 100644
>>> --- a/kernel/rcu/tree_nocb.h
>>> +++ b/kernel/rcu/tree_nocb.h
>>> @@ -746,6 +746,8 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
>>>  			needwait_gp = true;
>>>  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>>>  					    TPS("NeedWaitGP"));
>>> +		} else if (!rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
>>> +			rdp->nocb_gp_wake_attempt = false;
>>>  		}
>>
>> Hmm, I am trying to understand why this suggestion is better than what I
>> already have. It is one extra line and adds another conditional.
>>
>> Also shouldn't it be:
>>
>>   } else if (!rcu_cblist_n_cbs(&rdp->nocb_bypass) &&
>>              rcu_segcblist_empty(&rdp->cblist)) {
>>       rdp->nocb_gp_wake_attempt = false;
>>   }
>>
>>   ?
> 
> This else already means that rcu_segcblist_nextgp() returned false because there
> is no pending callbacks. rcu_segcblist_empty() is different because it also test
> done callbacks.
> 
>>
>> My goal was to mark wake_attempt as false when ALL callbacks on the rdp were
>> drained. IOW, the GP thread is done with the rdp.
> 
> So nocb_gp_wait (the rcuog kthreads) handle the pending callbacks, advancing
> them throughout grace periods until they are moved to the done state.
> 
> But once they are moved as done, the callbacks are the responsibility of
> nocb_cb_wait() (the rcuo kthreads) and nocb_gp_wait() stops paying attention
> to that rdp if there are no more pending callbacks.
> 
> So with your initial patch, you still have rdp->nocb_gp_wake_attempt == true
> even when there are only done callbacks. But without an appropriate wake-up
> after subsequent enqueue, nocb_gp_wait() may ignore new callbacks, event though
> rdp->nocb_gp_wake_attempt == true suggests otherwise.

Ah, got it! I was clubbing the acting of waking up rcuog to that of both the
rcuog and rcuop/s threads. Your suggestion, instead, is more accurate so I will
do it that way instead. Thanks for the explanations!

 - Joel


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ