Message-ID: <Z6nieJ2M6Ro7CeO_@J2N7QTR9R3>
Date: Mon, 10 Feb 2025 11:26:48 +0000
From: Mark Rutland <mark.rutland@....com>
To: Jinjie Ruan <ruanjinjie@...wei.com>, tglx@...utronix.de
Cc: catalin.marinas@....com, will@...nel.org, oleg@...hat.com,
sstabellini@...nel.org, peterz@...radead.org, luto@...nel.org,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, kees@...nel.org,
wad@...omium.org, akpm@...ux-foundation.org,
samitolvanen@...gle.com, masahiroy@...nel.org, hca@...ux.ibm.com,
aliceryhl@...gle.com, rppt@...nel.org, xur@...gle.com,
paulmck@...nel.org, arnd@...db.de, mbenes@...e.cz,
puranjay@...nel.org, pcc@...gle.com, ardb@...nel.org,
sudeep.holla@....com, guohanjun@...wei.com, rafael@...nel.org,
liuwei09@...tc.cn, dwmw@...zon.co.uk, Jonathan.Cameron@...wei.com,
liaochang1@...wei.com, kristina.martsenko@....com, ptosi@...gle.com,
broonie@...nel.org, thiago.bauermann@...aro.org,
kevin.brodsky@....com, joey.gouly@....com, liuyuntao12@...wei.com,
leobras@...hat.com, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
xen-devel@...ts.xenproject.org
Subject: Re: [PATCH -next v5 03/22] arm64: entry: Move
arm64_preempt_schedule_irq() into __exit_to_kernel_mode()
On Fri, Dec 06, 2024 at 06:17:25PM +0800, Jinjie Ruan wrote:
> The generic entry code try to reschedule every time when the kernel
> mode non-NMI exception return. At the moment, arm64 only reschedule every
> time when EL1 irq exception return;
I think this is a bit unclear, and should say something like:
| The arm64 entry code only preempts a kernel context upon a return from
| a regular IRQ exception. The generic entry code may preempt a kernel
| context for any exception return where irqentry_exit() is used, and so
| may preempt other exceptions such as faults.
Thomas, can you confirm that's the *intent* of the generic entry code?
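For reference, the preemption decision in the generic irqentry_exit()
looks roughly like this (a condensed sketch of kernel/entry/common.c,
with elided detail marked "..."):

    noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
    {
            if (user_mode(regs)) {
                    irqentry_exit_to_user_mode(regs);
            } else if (!regs_irqs_disabled(regs)) {
                    /* The interrupted context had IRQs enabled */
                    if (state.exit_rcu) {
                            /* RCU was not watching on entry: no preemption */
                            ...
                            return;
                    }
                    if (IS_ENABLED(CONFIG_PREEMPTION))
                            irqentry_exit_cond_resched();
                    ...
            } else {
                    /* The interrupted context had IRQs disabled */
                    ...
            }
    }

Note that this runs for any non-NMI exception whose handler returns via
irqentry_exit(), not just IRQs.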
> In preparation for moving arm64 over to the generic entry code, move
> arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so not
> only EL1 irq but also all EL1 non-NMI exception return, there is a chance
> to reschedule. And only if irqs are enabled when the exception trapped,
> there may be a chance to reschedule after the exceptions have been handled,
> so move arm64_preempt_schedule_irq() into regs_irqs_disabled()
> check false block, but it will try to reschedule only when TINY_RCU is
> enabled or current is not an idle task.
I think the detail is confusing here, and it would be better to say:
| In preparation for moving arm64 over to the generic entry code, align
| arm64 with the generic behaviour by calling
| arm64_preempt_schedule_irq() from exit_to_kernel_mode(). To make this
| possible, arm64_preempt_schedule_irq() and need_irq_preemption() are
| moved earlier in the file, with no changes.
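With that applied, the tail of __exit_to_kernel_mode() ends up mirroring
the generic code above, i.e. roughly (condensed from the diff below, with
surrounding context elided):

    static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
                                                      irqentry_state_t state)
    {
            if (interrupts_enabled(regs)) {
                    if (state.exit_rcu) {
                            /* RCU was not watching on entry: no preemption */
                            ...
                            return;
                    }

                    arm64_preempt_schedule_irq();

                    trace_hardirqs_on();
            } else {
                    ...
            }
    }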
Mark.
> As Mark pointed out, this change will have the following 2 key impact:
>
> - " We'll preempt even without taking a "real" interrupt. That
> shouldn't result in preemption that wasn't possible before,
> but it does change the probability of preempting at certain points,
> and might have a performance impact, so probably warrants a
> benchmark."
>
> - " We will not preempt when taking interrupts from a region of kernel
> code where IRQs are enabled but RCU is not watching, matching the
> behaviour of the generic entry code.
>
> This has the potential to introduce livelock if we can ever have a
> screaming interrupt in such a region, so we'll need to go figure out
> whether that's actually a problem.
>
> Having this as a separate patch will make it easier to test/bisect
> for that specifically."
>
> Suggested-by: Mark Rutland <mark.rutland@....com>
> Signed-off-by: Jinjie Ruan <ruanjinjie@...wei.com>
> ---
> arch/arm64/kernel/entry-common.c | 88 ++++++++++++++++----------------
> 1 file changed, 44 insertions(+), 44 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index 1687627b2ecf..7a588515ee07 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -75,6 +75,48 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
> return state;
> }
>
> +#ifdef CONFIG_PREEMPT_DYNAMIC
> +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> +#define need_irq_preemption() \
> + (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
> +#else
> +#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
> +#endif
> +
> +static void __sched arm64_preempt_schedule_irq(void)
> +{
> + if (!need_irq_preemption())
> + return;
> +
> + /*
> + * Note: thread_info::preempt_count includes both thread_info::count
> + * and thread_info::need_resched, and is not equivalent to
> + * preempt_count().
> + */
> + if (READ_ONCE(current_thread_info()->preempt_count) != 0)
> + return;
> +
> + /*
> + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
> + * priority masking is used the GIC irqchip driver will clear DAIF.IF
> + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
> + * DAIF we must have handled an NMI, so skip preemption.
> + */
> + if (system_uses_irq_prio_masking() && read_sysreg(daif))
> + return;
> +
> + /*
> + * Preempting a task from an IRQ means we leave copies of PSTATE
> + * on the stack. cpufeature's enable calls may modify PSTATE, but
> + * resuming one of these preempted tasks would undo those changes.
> + *
> + * Only allow a task to be preempted once cpufeatures have been
> + * enabled.
> + */
> + if (system_capabilities_finalized())
> + preempt_schedule_irq();
> +}
> +
> /*
> * Handle IRQ/context state management when exiting to kernel mode.
> * After this function returns it is not safe to call regular kernel code,
> @@ -97,6 +139,8 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs,
> return;
> }
>
> + arm64_preempt_schedule_irq();
> +
> trace_hardirqs_on();
> } else {
> if (state.exit_rcu)
> @@ -281,48 +325,6 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs,
> lockdep_hardirqs_on(CALLER_ADDR0);
> }
>
> -#ifdef CONFIG_PREEMPT_DYNAMIC
> -DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
> -#define need_irq_preemption() \
> - (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
> -#else
> -#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
> -#endif
> -
> -static void __sched arm64_preempt_schedule_irq(void)
> -{
> - if (!need_irq_preemption())
> - return;
> -
> - /*
> - * Note: thread_info::preempt_count includes both thread_info::count
> - * and thread_info::need_resched, and is not equivalent to
> - * preempt_count().
> - */
> - if (READ_ONCE(current_thread_info()->preempt_count) != 0)
> - return;
> -
> - /*
> - * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
> - * priority masking is used the GIC irqchip driver will clear DAIF.IF
> - * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
> - * DAIF we must have handled an NMI, so skip preemption.
> - */
> - if (system_uses_irq_prio_masking() && read_sysreg(daif))
> - return;
> -
> - /*
> - * Preempting a task from an IRQ means we leave copies of PSTATE
> - * on the stack. cpufeature's enable calls may modify PSTATE, but
> - * resuming one of these preempted tasks would undo those changes.
> - *
> - * Only allow a task to be preempted once cpufeatures have been
> - * enabled.
> - */
> - if (system_capabilities_finalized())
> - preempt_schedule_irq();
> -}
> -
> static void do_interrupt_handler(struct pt_regs *regs,
> void (*handler)(struct pt_regs *))
> {
> @@ -591,8 +593,6 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
> do_interrupt_handler(regs, handler);
> irq_exit_rcu();
>
> - arm64_preempt_schedule_irq();
> -
> exit_to_kernel_mode(regs, state);
> }
>
> static void noinstr el1_interrupt(struct pt_regs *regs,
> --
> 2.34.1
>