Message-ID: <87lear4wj6.fsf@oracle.com>
Date: Mon, 20 Nov 2023 19:26:05 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: paulmck@...nel.org
Cc: Ankur Arora <ankur.a.arora@...cle.com>,
linux-kernel@...r.kernel.org, tglx@...utronix.de,
peterz@...radead.org, torvalds@...ux-foundation.org,
linux-mm@...ck.org, x86@...nel.org, akpm@...ux-foundation.org,
luto@...nel.org, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, willy@...radead.org, mgorman@...e.de,
jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com,
jgross@...e.com, andrew.cooper3@...rix.com, mingo@...nel.org,
bristot@...nel.org, mathieu.desnoyers@...icios.com,
geert@...ux-m68k.org, glaubitz@...sik.fu-berlin.de,
anton.ivanov@...bridgegreys.com, mattst88@...il.com,
krypton@...ich-teichert.org, rostedt@...dmis.org,
David.Laight@...lab.com, richard@....at, mjguzik@...il.com
Subject: Re: [RFC PATCH 48/86] rcu: handle quiescent states for PREEMPT_RCU=n
Paul E. McKenney <paulmck@...nel.org> writes:
> On Tue, Nov 07, 2023 at 01:57:34PM -0800, Ankur Arora wrote:
>> cond_resched() is used to provide urgent quiescent states for
>> read-side critical sections on PREEMPT_RCU=n configurations.
>> This was necessary because lacking preempt_count, there was no
>> way for the tick handler to know if we were executing in RCU
>> read-side critical section or not.
>>
>> An always-on CONFIG_PREEMPT_COUNT, however, allows the tick to
>> reliably report quiescent states.
>>
>> Accordingly, evaluate preempt_count() based quiescence in
>> rcu_flavor_sched_clock_irq().
>>
>> Suggested-by: Paul E. McKenney <paulmck@...nel.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
>> ---
>> kernel/rcu/tree_plugin.h | 3 ++-
>> kernel/sched/core.c | 15 +--------------
>> 2 files changed, 3 insertions(+), 15 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> index f87191e008ff..618f055f8028 100644
>> --- a/kernel/rcu/tree_plugin.h
>> +++ b/kernel/rcu/tree_plugin.h
>> @@ -963,7 +963,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>> */
>> static void rcu_flavor_sched_clock_irq(int user)
>> {
>> - if (user || rcu_is_cpu_rrupt_from_idle()) {
>> + if (user || rcu_is_cpu_rrupt_from_idle() ||
>> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
>
> This looks good.
>
>> /*
>> * Get here if this CPU took its interrupt from user
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index bf5df2b866df..15db5fb7acc7 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -8588,20 +8588,7 @@ int __sched _cond_resched(void)
>> preempt_schedule_common();
>> return 1;
>> }
>> - /*
>> - * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
>> - * whether the current CPU is in an RCU read-side critical section,
>> - * so the tick can report quiescent states even for CPUs looping
>> - * in kernel context. In contrast, in non-preemptible kernels,
>> - * RCU readers leave no in-memory hints, which means that CPU-bound
>> - * processes executing in kernel context might never report an
>> - * RCU quiescent state. Therefore, the following code causes
>> - * cond_resched() to report a quiescent state, but only when RCU
>> - * is in urgent need of one.
>> - */
>> -#ifndef CONFIG_PREEMPT_RCU
>> - rcu_all_qs();
>> -#endif
>
> But...
>
> Suppose we have a long-running loop in the kernel that regularly
> enables preemption, but only momentarily. Then the added
> rcu_flavor_sched_clock_irq() check would almost always fail, making
> for extremely long grace periods.

So, my thinking was that if RCU wants to end a grace period, it would
force a context switch by setting TIF_NEED_RESCHED (and, as patch 38
mentions, RCU always uses the eager version), causing __schedule() to
call rcu_note_context_switch().
That's similar to the preempt_schedule_common() case in
_cond_resched() above.

But if I understand your point, RCU might just want to register a
quiescent state, and for this long-running loop
rcu_flavor_sched_clock_irq() does seem to fall down.

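
To make the failure mode concrete, here is a rough userspace model (not
kernel code, and not part of the patch). The two masks follow the
preempt_count layout in include/linux/preempt.h; the helper names and the
timing model (a loop that has preemption enabled for only one time unit per
period, sampled by a periodic tick) are assumptions made up for
illustration.

```c
/* Userspace model only: illustrates the tick-time check, nothing more. */
#include <assert.h>
#include <stdbool.h>

/* Bits 0-7 count preempt_disable() nesting, bits 8-15 softirq nesting. */
#define PREEMPT_MASK 0x000000ffu
#define SOFTIRQ_MASK 0x0000ff00u

/* The condition this patch adds to rcu_flavor_sched_clock_irq(). */
static bool tick_sees_quiescent_state(unsigned int preempt_count)
{
	return !(preempt_count & (PREEMPT_MASK | SOFTIRQ_MASK));
}

/*
 * Hypothetical loop: each iteration lasts `period` time units and has
 * preemption enabled only during the last unit. A tick fires every
 * `tick_interval` units. Returns how many of the first `nr_ticks` ticks
 * would observe a quiescent state.
 */
static unsigned int count_qs_ticks(unsigned int period,
				   unsigned int tick_interval,
				   unsigned int nr_ticks)
{
	unsigned int hits = 0;

	assert(period > 0 && tick_interval > 0);

	for (unsigned int i = 1; i <= nr_ticks; i++) {
		unsigned int t = i * tick_interval;
		/* Modelled preempt_count: 0 only in the final unit of a period. */
		unsigned int count = (t % period == period - 1) ? 0 : 1;

		if (tick_sees_quiescent_state(count))
			hits++;
	}
	return hits;
}
```

In this model, count_qs_ticks(1000, 7, 1000) returns 1: only one tick in a
thousand lands in the short preemptible window, which is the "almost
always fail" behaviour described above. Note that a value with only the
hardirq bits set (for example 0x10000) still passes the check, since the
tick itself runs in hardirq context and those bits are deliberately not in
the mask.
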
> Or did I miss a change that causes preempt_enable() to help RCU out?
Something like this?
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index dc5125b9c36b..e50f358f1548 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -222,6 +222,8 @@ do { \
barrier(); \
if (unlikely(preempt_count_dec_and_test())) \
__preempt_schedule(); \
+ if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) \
+ rcu_all_qs(); \
} while (0)
Though I do wonder about the likelihood of hitting the case you describe.
Maybe, instead of adding the check on every preempt_enable(), it would
be better to force a context switch in rcu_flavor_sched_clock_irq()
(as we do in the PREEMPT_RCU=y case).
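
For reference, the preempt_enable() variant can be sketched as a userspace
model (again, not kernel code). The struct and the model_* names are
hypothetical; the only behaviour borrowed from the kernel is that
rcu_all_qs() returns early unless RCU has flagged an urgent need for a
quiescent state. The reschedule path of the real preempt_enable() is left
out.

```c
/* Userspace model only: shows when the proposed hook would report a QS. */
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_MASK 0x000000ffu
#define SOFTIRQ_MASK 0x0000ff00u

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
	unsigned int preempt_count;
	bool rcu_urgent_qs;       /* stand-in for RCU urgently wanting a QS */
	unsigned int qs_reported; /* number of quiescent states reported */
};

/* Models rcu_all_qs(): a no-op unless a QS is urgently needed. */
static void model_rcu_all_qs(struct cpu_model *cpu)
{
	if (!cpu->rcu_urgent_qs)
		return;
	cpu->rcu_urgent_qs = false;
	cpu->qs_reported++;
}

static void model_preempt_disable(struct cpu_model *cpu)
{
	cpu->preempt_count++;
}

/* Models preempt_enable() with the extra check from the diff above. */
static void model_preempt_enable(struct cpu_model *cpu)
{
	/* Catch unbalanced enable calls in the model. */
	assert(cpu->preempt_count & PREEMPT_MASK);
	cpu->preempt_count--;

	if (!(cpu->preempt_count & (PREEMPT_MASK | SOFTIRQ_MASK)))
		model_rcu_all_qs(cpu);
}
```

With nested sections, only the outermost model_preempt_enable() reports,
and only once per urgent request. The cost being weighed is the extra
preempt_count() test on every preempt_enable(), even when RCU needs
nothing.
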
Thanks
--
ankur