Message-ID: <87pl6m451r.ffs@tglx>
Date: Tue, 03 Feb 2026 09:14:40 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: arnd@...db.de, anna-maria@...utronix.de, frederic@...nel.org,
luto@...nel.org, mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
linux-kernel@...r.kernel.org, oliver.sang@...el.com
Subject: Re: [PATCH v2 5/6] entry,hrtimer: Push reprogramming timers into
the interrupt return path

On Tue, Feb 03 2026 at 00:28, Thomas Gleixner wrote:
> On Mon, Feb 02 2026 at 17:33, Peter Zijlstra wrote:
>> So a nested IRQ at this point will have !user_mode(), but I think it can
>> still end up in softirqs due to that hardirq_stack_inuse. Should we
>> perhaps make sure only user_mode() ends up in softirqs?
>
> All interrupts independent of the mode they hit are ending up in
> irq_exit_rcu() and therefore in __irq_exit_rcu()
>
> run_irq_on_irqstack_cond()
>     if (user_mode() || hardirq_stack_inuse)
>         // Stay on user or hardirq stack
>         irq_enter_rcu();
>         func_c();
>         irq_exit_rcu()
>     else
>         // MAGIC ASM to switch to hardirq stack
>         call irq_enter_rcu
>         call func_c
>         call irq_exit_rcu
>
> The only reason why invoke_softirq() won't be called is when the
> interrupt hits into the softirq processing region of the previous
> interrupt, which means it's already on the hardirq stack.

In the case I pointed out, where the second interrupt hits right after
the exit to user path enabled interrupts, there is no nesting. It will
happily take the second path, which switches to the hardirq stack, and
then process soft interrupts on return.

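
To make the two decisions above concrete, here is a minimal user space
sketch in C. It is not kernel code: struct irq_ctx, pick_stack() and
runs_softirqs_on_exit() are hypothetical names made up for illustration,
and the softirq condition is simplified to the one case described above
(the interrupt hitting the softirq processing region of a previous
interrupt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what an interrupt sees on entry. */
struct irq_ctx {
    bool user_mode;           /* interrupt hit user space */
    bool hardirq_stack_inuse; /* already running on the hardirq stack */
    bool in_softirq_region;   /* hit softirq processing of a previous interrupt */
    bool softirq_pending;     /* softirqs were raised and wait to be run */
};

enum stack_path {
    STAY_ON_CURRENT_STACK,
    SWITCH_TO_HARDIRQ_STACK,
};

/* Mirrors the condition quoted for run_irq_on_irqstack_cond(). */
static enum stack_path pick_stack(const struct irq_ctx *c)
{
    assert(c != NULL);
    if (c->user_mode || c->hardirq_stack_inuse)
        return STAY_ON_CURRENT_STACK;
    return SWITCH_TO_HARDIRQ_STACK;
}

/*
 * Simplified stand-in for the check on interrupt exit: softirqs are
 * skipped only when the interrupt nested into softirq processing.
 */
static bool runs_softirqs_on_exit(const struct irq_ctx *c)
{
    assert(c != NULL);
    return c->softirq_pending && !c->in_softirq_region;
}
```

With user_mode, hardirq_stack_inuse and in_softirq_region all false and
softirq_pending true, which models the interrupt arriving in kernel mode
right after interrupts were enabled, pick_stack() selects the hardirq
stack and runs_softirqs_on_exit() is true.
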
> But looking at this there is already a problem without interrupt
> nesting:
>
> irq_enter_rcu();
> timer_interrupt()
>     hrtimer_interrupt()
>         delay_rearm();
> irq_exit_rcu()
>     __irq_exit_rcu()
>         invoke_softirq() <- Here
>
> Soft interrupts can run for quite some time, which means this already
> can cause timers being delayed for way too long. I think in
> __irq_exit_rcu() you want to do:
>
> if (!in_interrupt() && local_softirq_pending()) {
>     hrtimer_rearm();
>     invoke_softirq();
> }

Actually it's worse. Assume the CPU on which this happens has the
jiffies duty. As the timer does not fire, jiffies goes stale. So
anything which relies on jiffies moving forward gets stuck until some
other condition breaks the tie. That's going to be fun to debug :)

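
A back-of-the-envelope model of that delay, as a user space C sketch
rather than kernel code. tick_lateness_ns() and stale_tick_periods() are
hypothetical helpers. The model assumes the timer interrupt fires at
time 0, the next tick is due one tick period later, softirq processing
starts immediately and runs for softirq_runtime_ns, and the hardware is
only programmed at the moment the rearm happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * How late the next tick fires, relative to its intended expiry.
 * Hypothetical model: the timer interrupt happened at t = 0 and the next
 * tick is due at t = tick_period_ns.
 */
static uint64_t tick_lateness_ns(uint64_t tick_period_ns,
                                 uint64_t softirq_runtime_ns,
                                 bool rearm_before_softirq)
{
    assert(tick_period_ns > 0);

    /* The point in time at which the hardware gets reprogrammed. */
    uint64_t programmed_at = rearm_before_softirq ? 0 : softirq_runtime_ns;
    uint64_t expiry = tick_period_ns;

    /* A timer programmed after its expiry fires right away, but late. */
    uint64_t fires_at = programmed_at > expiry ? programmed_at : expiry;

    return fires_at - expiry;
}

/* Whole tick periods by which the tick, and with it jiffies, lags. */
static uint64_t stale_tick_periods(uint64_t tick_period_ns,
                                   uint64_t softirq_runtime_ns,
                                   bool rearm_before_softirq)
{
    return tick_lateness_ns(tick_period_ns, softirq_runtime_ns,
                            rearm_before_softirq) / tick_period_ns;
}
```

In this model a 1 ms tick with 5 ms of softirq processing is 4 ms late
when the rearm happens after invoke_softirq(), and not late at all when
the rearm happens before it. On the CPU which has the jiffies duty that
lateness is time during which jiffies does not advance.
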
Thanks,
tglx