Message-ID: <ZxejvAmccYMTa4P1@J2N7QTR9R3>
Date: Tue, 22 Oct 2024 14:08:12 +0100
From: Mark Rutland <mark.rutland@....com>
To: Jinjie Ruan <ruanjinjie@...wei.com>
Cc: catalin.marinas@....com, will@...nel.org, oleg@...hat.com,
tglx@...utronix.de, peterz@...radead.org, luto@...nel.org,
kees@...nel.org, wad@...omium.org, rostedt@...dmis.org,
arnd@...db.de, ardb@...nel.org, broonie@...nel.org,
rick.p.edgecombe@...el.com, leobras@...hat.com,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v3 0/3] arm64: entry: Convert to generic entry

On Tue, Oct 22, 2024 at 08:07:54PM +0800, Jinjie Ruan wrote:
> On 2024/10/17 23:25, Mark Rutland wrote:
> > There's also some indirection that I don't think is necessary *and*
> > hides important ordering concerns and results in mistakes. In
> > particular, note that before this series, enter_from_kernel_mode() calls
> > the (instrumentable) MTE checks *after* all the necessary lockdep+RCU
> > management is performed by __enter_from_kernel_mode():
> >
> > static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
> > {
> > 	__enter_from_kernel_mode(regs);
> > 	mte_check_tfsr_entry();
> > 	mte_disable_tco_entry(current);
> > }
> >
> > ... whereas after this series is applied, those MTE checks are placed in
> > arch_enter_from_kernel_mode(), which irqentry_enter() calls *before* the
> > necessary lockdep+RCU management. That is broken.
> >
> > It would be better to keep that explicit in the arm64 entry code with
> > arm64-specific wrappers, e.g.
> >
> > static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
> > {
> > 	irqentry_state_t state = irqentry_enter(regs);
> > 	mte_check_tfsr_entry();
> > 	mte_disable_tco_entry(current);
> >
> > 	return state;
> > }
>
> Hi Mark, it seems there is a problem with arm64_preempt_schedule_irq()
> when wrapping irqentry_exit() with exit_to_kernel_mode().
>
> arm64_preempt_schedule_irq() handles PREEMPT_DYNAMIC and IRQ
> preemption, which corresponds to raw_irqentry_exit_cond_resched() in
> the generic code, called from irqentry_exit().
>
> Currently only __el1_irq() calls arm64_preempt_schedule_irq(), but once
> we switch exit_to_kernel_mode() to an arm64-specific wrapper around
> irqentry_exit(), not only __el1_irq() but also el1_abort(), el1_pc(),
> el1_undef(), etc. will try to reschedule via the same logic as
> arm64_preempt_schedule_irq().

Yes, the generic entry code will preempt any context where an interrupt
*could* have been taken from.

I'm not sure it actually matters either way; I believe that the generic
entry code was written this way because that's what x86 did, and
checking whether interrupts are enabled in the interrupted context is
cheap.
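
For reference, the relevant part of the generic exit path looks roughly
like the following (paraphrased from irqentry_exit() in
kernel/entry/common.c, eliding the lockdep/tracing/instrumentation
details):

| noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
| {
| 	if (user_mode(regs)) {
| 		irqentry_exit_to_user_mode(regs);
| 	} else if (!regs_irqs_disabled(regs)) {
| 		if (state.exit_rcu) {
| 			/* RCU wasn't watching on entry; no preemption here */
| 			...
| 			return;
| 		}
| 		if (IS_ENABLED(CONFIG_PREEMPTION))
| 			irqentry_exit_cond_resched();
| 		...
| 	} else {
| 		...
| 	}
| }

Note that the only check on the interrupted context is
!regs_irqs_disabled(regs), regardless of which exception was taken.
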
I'd suggest you first write a patch to align arm64's entry code with the
generic code, by removing the call to arm64_preempt_schedule_irq() from
__el1_irq(), and adding a call to arm64_preempt_schedule_irq() in
__exit_to_kernel_mode(), e.g.

| static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
| {
| 	...
| 	if (interrupts_enabled(regs)) {
| 		...
| 		if (regs->exit_rcu) {
| 			...
| 		}
| 		...
| 		arm64_preempt_schedule_irq();
| 		...
| 	} else {
| 		...
| 	}
| }
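
The __el1_irq() side of that change would then look something like the
following (untested sketch, based on the current shape of
arch/arm64/kernel/entry-common.c):

| static void noinstr __el1_irq(struct pt_regs *regs,
| 			      void (*handler)(struct pt_regs *))
| {
| 	enter_from_kernel_mode(regs);
|
| 	irq_enter_rcu();
| 	do_interrupt_handler(regs, handler);
| 	irq_exit_rcu();
|
| 	/* preemption now handled in __exit_to_kernel_mode() */
| 	exit_to_kernel_mode(regs);
| }
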
That way the behaviour and structure will be more aligned with the
generic code, and with that as an independent patch it will be easier to
review/test/bisect/etc.

This change will have at least two key impacts:

(1) We'll preempt even without taking a "real" interrupt. That
    shouldn't result in preemption that wasn't possible before,
    but it does change the probability of preempting at certain points,
    and might have a performance impact, so probably warrants a
    benchmark.

(2) We will not preempt when taking interrupts from a region of kernel
    code where IRQs are enabled but RCU is not watching, matching the
    behaviour of the generic entry code.

    This has the potential to introduce livelock if we can ever have a
    screaming interrupt in such a region, so we'll need to go figure out
    whether that's actually a problem.

    Having this as a separate patch will make it easier to test/bisect
    for that specifically.

Mark.