[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <52679e35-c6a4-4e0c-876a-a80a1a9b2bd8@nvidia.com>
Date: Sun, 6 Jul 2025 15:16:54 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: paulmck@...nel.org
Cc: linux-kernel@...r.kernel.org, Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>, Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang@...ux.dev>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>, rcu@...r.kernel.org,
linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH RFC 2/3] rcu: Refactor expedited handling check in
rcu_read_unlock_special()
Hi Paul,
On 7/6/2025 1:18 PM, Paul E. McKenney wrote:
> On Sat, Jul 05, 2025 at 04:39:16PM -0400, Joel Fernandes wrote:
>> Extract the complex expedited handling condition in rcu_read_unlock_special()
>> into a separate function rcu_unlock_needs_exp_handling() with detailed
>> comments explaining each condition.
>>
>> This improves code readability. No functional change intended.
>
> Very nice!!!
Thanks!
>
> Some questions and comments interspersed below.
I replied inline below:
>
> Thanx, Paul
>
>> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
>> ---
>> kernel/rcu/tree_plugin.h | 80 +++++++++++++++++++++++++++++++++++-----
>> 1 file changed, 71 insertions(+), 9 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> index baf57745b42f..8504d95bb35b 100644
>> --- a/kernel/rcu/tree_plugin.h
>> +++ b/kernel/rcu/tree_plugin.h
>> @@ -647,6 +647,72 @@ static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
>> local_irq_restore(flags);
>> }
>>
>> +/*
>> + * Check if expedited grace period processing during unlock is needed.
>> + *
>> + * This function determines whether expedited handling is required based on:
>> + * 1. Task blocking an expedited grace period
>
> This is a heuristic. What we are actually checking is whether the task
> is blocking *some* grace period and whether at least one task (maybe
> this one, maybe not) is blocking an expedited grace period.
Makes sense, I changed this to:
* 1. Task blocking an expedited grace period (based on a heuristic, could be
* false-positive, see below.)
And the below comment to:
/*
* Check if this task is blocking an expedited grace period. If the
* task was preempted within an RCU read-side critical section and is
* on the expedited grace period blockers list (exp_tasks), we need
* expedited handling to unblock the expedited GP. This is not an exact
* check because 't' might not be on the exp_tasks list at all - its
* just a fast heuristic that can be false-positive sometimes.
*/
if (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks))
return true;
Hope that looks Ok.
>
> Why not an exact check? Because that would mean traversing the list
> starting at ->exp_tasks, and that list could potentially contain every
> task in the system. And I have received bug reports encountered on
> systems with hundreds of thousands of tasks.
Got it.
>
> I could imagine a more complex data structure that semi-efficiently
> tracked exact information, but I could also imagine this not being worth
> the effort.
>
>> + * 2. CPU participating in an expedited grace period
>> + * 3. Strict grace period mode requiring expedited handling
>> + * 4. RCU priority boosting needs when interrupts were disabled
>
> s/boosting/deboosting/
>
Fixed, thanks.
>
>> + */
>> + if (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks))
>> + return true;
>> +
>> + /*
>> + * Check if this CPU is participating in an expedited grace period.
>> + * The expmask bitmap tracks which CPUs need to check in for the
>> + * current expedited GP. If our CPU's bit is set, we need expedited
>> + * handling to help complete the expedited GP.
>> + */
>> + if (rdp->grpmask & READ_ONCE(rnp->expmask))
>> + return true;
>> +
>> + /*
>> + * In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels, all grace periods
>> + * are treated as short for testing purposes even if that means
>> + * disturbing the system more. Check if either:
>> + * - This CPU has not yet reported a quiescent state, or
>> + * - This task was preempted within an RCU critical section
>> + * In either case, requird expedited handling for strict GP mode.
>
> s/requird/required/ ;-)
I meant "require" :-D. Will fix.
>
>> + */
>> + if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) &&
>> + ((rdp->grpmask & READ_ONCE(rnp->qsmask)) || t->rcu_blocked_node))
>> + return true;
>> +
>> + /*
>> + * RCU priority boosting case: If a task is subject to RCU priority
>> + * boosting and exits an RCU read-side critical section with interrupts
>> + * disabled, we need expedited handling to ensure timely deboosting.
>> + * Without this, a low-priority task could incorrectly run at high
>> + * real-time priority for an extended period effecting real-time
>
> s/effecting/degrading/ to be more precise.
>
Fixed, thanks.
- Joel
Powered by blists - more mailing lists