Message-ID: <87wmq8pop1.ffs@tglx>
Date: Mon, 11 Mar 2024 20:12:58 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Joel Fernandes <joel@...lfernandes.org>, Ankur Arora
<ankur.a.arora@...cle.com>
Cc: paulmck@...nel.org, linux-kernel@...r.kernel.org, peterz@...radead.org,
torvalds@...ux-foundation.org, akpm@...ux-foundation.org, luto@...nel.org,
bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
willy@...radead.org, mgorman@...e.de, jpoimboe@...nel.org,
mark.rutland@....com, jgross@...e.com, andrew.cooper3@...rix.com,
bristot@...nel.org, mathieu.desnoyers@...icios.com, geert@...ux-m68k.org,
glaubitz@...sik.fu-berlin.de, anton.ivanov@...bridgegreys.com,
mattst88@...il.com, krypton@...ich-teichert.org, rostedt@...dmis.org,
David.Laight@...lab.com, richard@....at, mjguzik@...il.com,
jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com, rcu@...r.kernel.org
Subject: Re: [PATCH 15/30] rcu: handle quiescent states for PREEMPT_RCU=n,
PREEMPT_COUNT=y
On Mon, Mar 11 2024 at 11:25, Joel Fernandes wrote:
> On 3/11/2024 1:18 AM, Ankur Arora wrote:
>>> Yes, I mentioned this 'disabling preemption' aspect in my last email. My point
>>> being, unlike CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_AUTO allows for kernel
>>> preemption in preempt=none. So the "Don't preempt the kernel" behavior has
>>> changed. That is, preempt=none under CONFIG_PREEMPT_AUTO is different from
>>> CONFIG_PREEMPT_NONE=y already. Here we *are* preempting. And RCU is getting on
>>
>> I think that's a view from too close to the implementation. Someone
>> using the kernel is not necessarily concerned with whether tasks are
>> preempted or not. They are concerned with throughput and latency.
>
> No, we are not only talking about that (throughput/latency). We are also talking
> about the issue related to RCU reader-preemption causing OOM (well and that
> could hurt both throughput and latency as well).
That happens only when PREEMPT_RCU=y. For PREEMPT_RCU=n the read side
critical sections still have preemption disabled.
> With CONFIG_PREEMPT_AUTO=y, you now preempt in the preempt=none mode. Something
> very different from the classical CONFIG_PREEMPT_NONE=y.
In PREEMPT_RCU=y and preempt=none mode this happens only when really
required, i.e. when the task does not schedule out or return to user
space in time, or when a higher scheduling class task becomes runnable. For
the latter the jury is still out whether this should be done or just
lazily deferred like the SCHED_OTHER preemption requests.
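As a rough illustration of the conditions described above, the decision can be modelled in a few lines of C. This is a sketch, not kernel code: every name here (resched_action, resched_state, preempt_none_action, defer_higher_class) is hypothetical, and the defer_higher_class flag only stands in for the open question of whether higher scheduling class wakeups should force preemption or be deferred lazily.

```c
#include <stdbool.h>

/* Illustrative model only: none of these names exist in the kernel. */
enum resched_action {
    RESCHED_NONE,   /* nothing pending, keep running */
    RESCHED_LAZY,   /* wait for the task to schedule out or return to user space */
    RESCHED_FORCED  /* preempt the task inside the kernel now */
};

struct resched_state {
    bool request_pending;       /* a same-class (e.g. SCHED_OTHER) preemption request exists */
    bool grace_expired;         /* task did not schedule out or return to user space in time */
    bool higher_class_runnable; /* a task of a higher scheduling class became runnable */
};

/*
 * Model of preempt=none behaviour as described in the mail.
 * defer_higher_class selects between the two options still under
 * discussion for higher scheduling class wakeups.
 */
static enum resched_action
preempt_none_action(const struct resched_state *s, bool defer_higher_class)
{
    bool pending = s->request_pending || s->higher_class_runnable;

    if (!pending)
        return RESCHED_NONE;

    /* Option 1: a higher class wakeup forces preemption right away. */
    if (s->higher_class_runnable && !defer_higher_class)
        return RESCHED_FORCED;

    /* Otherwise the request stays lazy until the task has overstayed. */
    return s->grace_expired ? RESCHED_FORCED : RESCHED_LAZY;
}
```

In this model forced preemption is the exception: a pending SCHED_OTHER request only becomes a forced one after the task has failed to leave the kernel in time.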
In any case for that to matter this forced preemption would need to
preempt an RCU read side critical section and then keep the preempted
task away from the CPU for a long time.
That's very different from the unconditional kernel preemption model which
preempt=full provides and only marginally different from the existing
PREEMPT_NONE model. I know there might be dragons, but I'm not convinced
yet that this is an actual problem.
OTOH, doesn't PREEMPT_RCU=y have a mechanism to mitigate that already?
> Essentially this means preemption is now more aggressive from the point of view
> of a preempt=none user. I was suggesting that, a point of view could be RCU
> should always support preemptibility (don't give PREEMPT_RCU=n option) because
> AUTO *does preempt* unlike classic CONFIG_PREEMPT_NONE. Otherwise it is
> inconsistent -- say with CONFIG_PREEMPT=y (another *preemption mode*) which
> forces CONFIG_PREEMPT_RCU. However to Paul's point, we need to worry about those
> users who are concerned with running out of memory due to reader
> preemption.
What's wrong with the combination of PREEMPT_AUTO=y and PREEMPT_RCU=n?
Paul and I agreed long ago that this needs to be supported.
> In that vein, maybe we should also support CONFIG_PREEMPT_RCU=n for
> CONFIG_PREEMPT=y as well. There are plenty of popular systems with relatively
> low memory that need low latency (like some low-end devices / laptops
> :-)).
I'm not sure whether that's useful as the goal is to get rid of all the
CONFIG_PREEMPT_FOO options, no?
I'd rather spend brain cycles on figuring out whether RCU can be flipped
over between PREEMPT_RCU=n/y at boot or, obviously, at run-time.
Thanks,
tglx