[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0965542e-80a7-4837-b14e-903c635aa828@paulmck-laptop>
Date: Sun, 10 Mar 2024 20:56:02 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
tglx@...utronix.de, peterz@...radead.org,
torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
luto@...nel.org, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, willy@...radead.org, mgorman@...e.de,
jpoimboe@...nel.org, mark.rutland@....com, jgross@...e.com,
andrew.cooper3@...rix.com, bristot@...nel.org,
mathieu.desnoyers@...icios.com, geert@...ux-m68k.org,
glaubitz@...sik.fu-berlin.de, anton.ivanov@...bridgegreys.com,
mattst88@...il.com, krypton@...ich-teichert.org,
rostedt@...dmis.org, David.Laight@...lab.com, richard@....at,
mjguzik@...il.com, jon.grimm@....com, bharata@....com,
raghavendra.kt@....com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com, rcu@...r.kernel.org
Subject: Re: [PATCH 15/30] rcu: handle quiescent states for PREEMPT_RCU=n,
PREEMPT_COUNT=y
On Sun, Mar 10, 2024 at 08:48:28PM -0400, Joel Fernandes wrote:
> On 3/10/2024 2:56 PM, Paul E. McKenney wrote:
> > On Sun, Mar 10, 2024 at 06:03:30AM -0400, Joel Fernandes wrote:
> >> Hello Ankur and Paul,
> >>
> >> On Mon, Feb 12, 2024 at 09:55:39PM -0800, Ankur Arora wrote:
> >>> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> >>> states for read-side critical sections via rcu_all_qs().
> >>> One reason why this was necessary: lacking preempt-count, the tick
> >>> handler has no way of knowing whether it is executing in a read-side
> >>> critical section or not.
> >>>
> >>> With PREEMPT_AUTO=y, there can be configurations with (PREEMPT_COUNT=y,
> >>> PREEMPT_RCU=n). This means that cond_resched() is a stub which does
> >>> not provide for quiescent states via rcu_all_qs().
> >>>
> >>> So, use the availability of preempt_count() to report quiescent states
> >>> in rcu_flavor_sched_clock_irq().
> >>>
> >>> Suggested-by: Paul E. McKenney <paulmck@...nel.org>
> >>> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
> >>> ---
> >>> kernel/rcu/tree_plugin.h | 11 +++++++----
> >>> 1 file changed, 7 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> >>> index 26c79246873a..9b72e9d2b6fe 100644
> >>> --- a/kernel/rcu/tree_plugin.h
> >>> +++ b/kernel/rcu/tree_plugin.h
> >>> @@ -963,13 +963,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
> >>> */
> >>> static void rcu_flavor_sched_clock_irq(int user)
> >>> {
> >>> - if (user || rcu_is_cpu_rrupt_from_idle()) {
> >>> + if (user || rcu_is_cpu_rrupt_from_idle() ||
> >>> + (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> >>> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
> >>
> >> I was wondering if it makes sense to even support !PREEMPT_RCU under
> >> CONFIG_PREEMPT_AUTO.
> >>
> >> AFAIU, this CONFIG_PREEMPT_AUTO series preempts the kernel on
> >> the next tick boundary in the worst case, with all preempt modes including
> >> the preempt=none mode.
> >>
> >> Considering this, does it makes sense for RCU to be non-preemptible in
> >> CONFIG_PREEMPT_AUTO=y? Because if that were the case, and a read-side critical
> >> section extended beyond the tick, then it prevents the PREEMPT_AUTO preemption
> >> from happening, because rcu_read_lock() would preempt_disable().
> >
> > Yes, it does make sense for RCU to be non-preemptible in kernels
> > built with CONFIG_PREEMPT_AUTO=y and either CONFIG_PREEMPT_NONE=y or
> > CONFIG_PREEMPT_VOLUNTARY=y.
> > As noted in earlier discussions, there are
>
> Sorry if I missed a discussion, appreciate a link.
It is part of the discussion of the first version of this patch series,
if I recall correctly.
> > systems that are adequately but not abundantly endowed with memory.
> > Such systems need non-preemptible RCU to avoid preempted-reader OOMs.
>
> Then why don't such systems have a problem with CONFIG_PREEMPT_DYNAMIC=y and
> preempt=none mode? CONFIG_PREEMPT_DYNAMIC forces CONFIG_PREEMPT_RCU=y. There's
> no way to set CONFIG_PREEMPT_RCU=n with CONFIG_PREEMPT_DYNAMIC=y and
> preempt=none boot parameter. IMHO, if this feature is inconsistent with
> CONFIG_PREEMPT_DYNAMIC, that makes it super confusing. In fact, I feel
> CONFIG_PREEMPT_AUTO should instead just be another "preempt=auto" boot parameter
> mode added to CONFIG_PREEMPT_DYNAMIC feature, otherwise the proliferation of
> CONFIG_PREEMPT config options is getting a bit insane. And likely going to be
> burden to the users configuring the PREEMPT Kconfig option IMHO.
Because such systems are built with CONFIG_PREEMPT_DYNAMIC=n.
You could argue that we should just build with CONFIG_PREEMPT_AUTO=n,
but the long-term goal of eliminating cond_resched() will make that
ineffective.
> > Note well that non-preemptible RCU explicitly disables preemption across
> > all RCU readers.
>
> Yes, I mentioned this 'disabling preemption' aspect in my last email. My point
> being, unlike CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_AUTO allows for kernel
> preemption in preempt=none. So the "Don't preempt the kernel" behavior has
> changed. That is, preempt=none under CONFIG_PREEMPT_AUTO is different from
> CONFIG_PREEMPT_NONE=y already. Here we *are* preempting. And RCU is getting on
> the way. It is like saying, you want an option for CONFIG_PREEMPT_RCU to be set
> to =n for CONFIG_PREEMPT=y kernels, sighting users who want a fully-preemptible
> kernel but are worried about reader preemptions.
Such users can simply avoid building with either CONFIG_PREEMPT_NONE=y
or with CONFIG_PREEMPT_VOLUNTARY=y. They might also experiment with
CONFIG_RCU_BOOST=y, and also with short timeouts until boosting.
If that doesn't do what they need, we talk with them and either help
them configure their kernels, make RCU do what they need, or help work
out some other way for them to get their jobs done.
> That aside, as such, I do agree your point of view, that preemptible readers
> presents a problem to folks using preempt=none in this series and we could
> decide to keep CONFIG_PREEMPT_RCU optional for whoever wants it that way. I was
> just saying that I want CONFIG_PREEMPT_AUTO's preempt=none mode to be somewhat
> consistent with CONFIG_PREEMPT_DYNAMIC's preempt=none. Because I'm pretty sure a
> week from now, no one will likely be able to tell the difference ;-). So IMHO
> either CONFIG_PREEMPT_DYNAMIC should be changed to make CONFIG_PREEMPT_RCU
> optional, or this series should be altered to force CONFIG_PREEMPT_RCU=y.
>
> Let me know if I missed something.
Why not key off of the value of CONFIG_PREEMPT_DYNAMIC? That way,
if both CONFIG_PREEMPT_AUTO=y and CONFIG_PREEMPT_DYNAMIC=y, RCU is
always preemptible. Then CONFIG_PREEMPT_DYNAMIC=y enables boot-time
(and maybe even run-time) switching between preemption flavors, while
CONFIG_PREEMPT_AUTO instead enables unconditional preemption of any
region of code that has not explicitly disabled preemption (or irq or
bh or whatever).
But one way or another, we really do need non-preemptible RCU in
conjunction with CONFIG_PREEMPT_AUTO=y.
Also, I don't yet see CONFIG_PREEMPT_AUTO in -next.
Thanx, Paul
Powered by blists - more mailing lists