Message-ID: <CAAhV-H7cizXw-zta7sW+AKP3UiqRE52K7YdDhH7YoCr=LaCGWA@mail.gmail.com>
Date: Wed, 19 Feb 2025 11:03:25 +0800
From: Huacai Chen <chenhuacai@...nel.org>
To: Thomas Bogendoerfer <tsbogend@...ha.franken.de>
Cc: Marco Crivellari <marco.crivellari@...e.com>, linux-mips@...r.kernel.org,
linux-kernel@...r.kernel.org, Frederic Weisbecker <frederic@...nel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>, Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>, "Maciej W . Rozycki" <macro@...am.me.uk>
Subject: Re: [PATCH v2 1/1] MIPS: Fix idle VS timer enqueue
On Tue, Feb 18, 2025 at 9:51 PM Thomas Bogendoerfer
<tsbogend@...ha.franken.de> wrote:
>
> On Tue, Feb 18, 2025 at 08:14:43PM +0800, Huacai Chen wrote:
> > Hi, Thomas,
> >
> > On Tue, Feb 18, 2025 at 7:59 PM Thomas Bogendoerfer
> > <tsbogend@...ha.franken.de> wrote:
> > >
> > > On Tue, Feb 18, 2025 at 10:02:03AM +0100, Marco Crivellari wrote:
> > > > MIPS re-enables interrupts on its idle routine and performs
> > > > a TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.
> > > >
> > > > The IRQs firing between the check and the 'wait' instruction may set the
> > > > TIF_NEED_RESCHED flag. In order to deal with this possible race, IRQs
> > > > interrupting __r4k_wait() roll back their return address to the
> > > > beginning of __r4k_wait() so that TIF_NEED_RESCHED is checked
> > > > again before going back to sleep.
> > > >
> > > > However idle IRQs can also queue timers that may require a tick
> > > > reprogramming through a new generic idle loop iteration but those timers
> > > > would go unnoticed here because __r4k_wait() only checks
> > > > TIF_NEED_RESCHED. It doesn't check for pending timers.
> > > >
> > > > Fix this by fast-forwarding idle IRQs return address to the end of the
> > > > idle routine instead of the beginning, so that the generic idle loop
> > > > handles both TIF_NEED_RESCHED and pending timers.
> > > >
> > > > Signed-off-by: Marco Crivellari <marco.crivellari@...e.com>
> > > > ---
> > > > arch/mips/kernel/genex.S | 39 +++++++++++++++++++++------------------
> > > > arch/mips/kernel/idle.c | 1 -
> > > > 2 files changed, 21 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
> > > > index a572ce36a24f..9747b216648f 100644
> > > > --- a/arch/mips/kernel/genex.S
> > > > +++ b/arch/mips/kernel/genex.S
> > > > @@ -104,25 +104,27 @@ handle_vcei:
> > > >
> > > > __FINIT
> > > >
> > > > - .align 5 /* 32 byte rollback region */
> > > > + .align 5
> > > > LEAF(__r4k_wait)
> > > > .set push
> > > > .set noreorder
> > > > - /* start of rollback region */
> > > > - LONG_L t0, TI_FLAGS($28)
> > > > - nop
> > > > - andi t0, _TIF_NEED_RESCHED
> > > > - bnez t0, 1f
> > > > - nop
> > > > - nop
> > > > - nop
> > > > -#ifdef CONFIG_CPU_MICROMIPS
> > > > - nop
> > > > - nop
> > > > - nop
> > > > - nop
> > > > -#endif
> > >
> > > My quick search didn't find the reason for the extra NOPs on MICROMIPS, but
> > > they are here for a purpose. I might still need them...
> > The original code needs #ifdef CONFIG_CPU_MICROMIPS because a nop in
> > MICROMIPS is 2 bytes, so another four nops are needed to align. But _ssnop
> > is always 4 bytes, so we can remove the #ifdefs.
>
> I see
>
> > > > + _ssnop
> > > > + _ssnop
> > > > + _ssnop
> > >
> > > instead of handcoded hazard nops, use __irq_enable_hazard for that
> > No, I don't think so. This region must be exactly 32 bytes on every
> > platform, but __irq_enable_hazard expands differently across
> > configurations. 3 _ssnop is the fallback implementation and is available
> > on all MIPS.
>
> you are right for most cases, but there is one case
>
> #elif (defined(CONFIG_CPU_MIPSR1) && !defined(CONFIG_MIPS_ALCHEMY)) || \
> defined(CONFIG_CPU_BMIPS)
>
> which uses
>
> #define __irq_enable_hazard \
> ___ssnop; \
> ___ssnop; \
> ___ssnop; \
> ___ehb
>
> if MIPSR1 || BMIPS needs the "rollback" handler, 3 ssnop would be wrong as
> the irq enable hazard.
Emm, this is a problem. I think we can add _ehb after 3 _ssnop. And
then change the below "daddiu k0, 1" to "PTR_ADDIU k0, 5".
Maybe there is a better solution, but I think this is the simplest.
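To make the offset arithmetic concrete, here is a rough C sketch. It assumes
(this is not shown above) that the handler sets the low five bits of EPC with
something like "ori k0, 0x1f" before the add, and that the region starts on a
32-byte boundary as ".align 5" gives. idle_fast_forward() and the addresses
are made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the idle region is 32 bytes and 32-byte aligned (.align 5). */
#define IDLE_REGION_SIZE 32u

/*
 * Hypothetical helper mirroring "ori k0, 0x1f" followed by an add:
 * set the low five bits of the interrupted address, then step forward.
 */
static inline uintptr_t idle_fast_forward(uintptr_t epc, unsigned int skip)
{
	return (epc | (IDLE_REGION_SIZE - 1)) + skip;
}

/* Example addresses are invented; 0x1000 stands in for the region start. */
static void idle_fast_forward_selfcheck(void)
{
	/* skip = 1: any EPC in the region lands right after its 32 bytes. */
	assert(idle_fast_forward(0x1000, 1) == 0x1020);
	assert(idle_fast_forward(0x101c, 1) == 0x1020);
	/* skip = 5: lands one 4-byte instruction further, at start + 36. */
	assert(idle_fast_forward(0x1000, 5) == 0x1024);
}
```

So with 1 the return address ends up at start + 32, and with 5 at start + 36,
i.e. one extra 4-byte slot further, which is what an added _ehb would need if
the layout is as assumed.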
Huacai
>
> > > But I doubt this works, because the wait instruction is not aligned to
> > > a 32 byte boundary, but the code assumes this, IMHO.
> > Why? After this patch we only use 4 byte instructions.
>
> I should have looked at the compiled output, sorry for the noise
>
> Still this construct feels rather fragile.
>
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea. [ RFC1925, 2.3 ]