[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <dac18e2f-b40d-d02b-dad4-a849a6e6a764@huawei.com>
Date: Thu, 31 Oct 2024 12:02:04 +0800
From: Jinjie Ruan <ruanjinjie@...wei.com>
To: Mark Rutland <mark.rutland@....com>
CC: <oleg@...hat.com>, <linux@...linux.org.uk>, <will@...nel.org>,
<catalin.marinas@....com>, <sstabellini@...nel.org>, <maz@...nel.org>,
<tglx@...utronix.de>, <peterz@...radead.org>, <luto@...nel.org>,
<kees@...nel.org>, <wad@...omium.org>, <akpm@...ux-foundation.org>,
<samitolvanen@...gle.com>, <arnd@...db.de>, <ojeda@...nel.org>,
<rppt@...nel.org>, <hca@...ux.ibm.com>, <aliceryhl@...gle.com>,
<samuel.holland@...ive.com>, <paulmck@...nel.org>, <aquini@...hat.com>,
<petr.pavlu@...e.com>, <viro@...iv.linux.org.uk>,
<rmk+kernel@...linux.org.uk>, <ardb@...nel.org>,
<wangkefeng.wang@...wei.com>, <surenb@...gle.com>,
<linus.walleij@...aro.org>, <yangyj.ee@...il.com>, <broonie@...nel.org>,
<mbenes@...e.cz>, <puranjay@...nel.org>, <pcc@...gle.com>,
<guohanjun@...wei.com>, <sudeep.holla@....com>,
<Jonathan.Cameron@...wei.com>, <prarit@...hat.com>, <liuwei09@...tc.cn>,
<dwmw@...zon.co.uk>, <oliver.upton@...ux.dev>, <kristina.martsenko@....com>,
<ptosi@...gle.com>, <frederic@...nel.org>, <vschneid@...hat.com>,
<thiago.bauermann@...aro.org>, <joey.gouly@....com>,
<liuyuntao12@...wei.com>, <leobras@...hat.com>,
<linux-kernel@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
<xen-devel@...ts.xenproject.org>
Subject: Re: [PATCH -next v4 06/19] arm64: entry: Move
arm64_preempt_schedule_irq() into exit_to_kernel_mode()
On 2024/10/29 22:52, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:47PM +0800, Jinjie Ruan wrote:
>> Move arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so not
>> only __el1_irq() but also every time when kernel mode irq return,
>> there is a chance to reschedule.
>
> We use exit_to_kernel_mode() for every non-NMI exception return to the
> kernel, not just IRQ returns.
Yes, it it not only irq but other non-NMI exception, will update it.
>
>> As Mark pointed out, this change will have the following key impact:
>>
>> "We'll preempt even without taking a "real" interrupt. That
>> shouldn't result in preemption that wasn't possible before,
>> but it does change the probability of preempting at certain points,
>> and might have a performance impact, so probably warrants a
>> benchmark."
>
> For anyone following along at home, I said that at:
>
> https://lore.kernel.org/linux-arm-kernel/ZxejvAmccYMTa4P1@J2N7QTR9R3/
>
> ... and there I specifically said:
Thank you!
This one and the next patch will be merged as you suggested.
I would have thought it would have been clearer to put it in
__exit_to_kernel_mode() and move it to interrupt enabled block in two steps.
>
>> I's suggest you first write a patch to align arm64's entry code with the
>> generic code, by removing the call to arm64_preempt_schedule_irq() from
>> __el1_irq(), and adding a call to arm64_preempt_schedule_irq() in
>> __exit_to_kernel_mode(), e.g.
>>
>> | static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
>> | {
>> | ...
>> | if (interrupts_enabled(regs)) {
>> | ...
>> | if (regs->exit_rcu) {
>> | ...
>> | }
>> | ...
>> | arm64_preempt_schedule_irq();
>> | ...
>> | } else {
>> | ...
>> | }
>> | }
>
> [...]
>
>> +#ifdef CONFIG_PREEMPT_DYNAMIC
>> +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
>> +#define need_irq_preemption() \
>> + (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
>> +#else
>> +#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
>> +#endif
>> +
>> +static void __sched arm64_preempt_schedule_irq(void)
>> +{
>> + if (!need_irq_preemption())
>> + return;
>> +
>> + /*
>> + * Note: thread_info::preempt_count includes both thread_info::count
>> + * and thread_info::need_resched, and is not equivalent to
>> + * preempt_count().
>> + */
>> + if (READ_ONCE(current_thread_info()->preempt_count) != 0)
>> + return;
>> +
>> + /*
>> + * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
>> + * priority masking is used the GIC irqchip driver will clear DAIF.IF
>> + * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
>> + * DAIF we must have handled an NMI, so skip preemption.
>> + */
>> + if (system_uses_irq_prio_masking() && read_sysreg(daif))
>> + return;
>> +
>> + /*
>> + * Preempting a task from an IRQ means we leave copies of PSTATE
>> + * on the stack. cpufeature's enable calls may modify PSTATE, but
>> + * resuming one of these preempted tasks would undo those changes.
>> + *
>> + * Only allow a task to be preempted once cpufeatures have been
>> + * enabled.
>> + */
>> + if (system_capabilities_finalized())
>> + preempt_schedule_irq();
>> +}
>> +
>> /*
>> * Handle IRQ/context state management when exiting to kernel mode.
>> * After this function returns it is not safe to call regular kernel code,
>> @@ -72,6 +114,8 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>> static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
>> irqentry_state_t state)
>> {
>> + arm64_preempt_schedule_irq();
>
> This is broken; exit_to_kernel_mode() is called for any non-NMI return
> excpetion return to the kernel, and this doesn't check that interrupts
> were enabled in the context the exception was taken from.
>
> This will preempt in cases where we should not, e.g. if we WARN() in a section with
> IRQs disabled.
>
> Mark.
>
Powered by blists - more mailing lists