[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAhV-H7Tko290LSCJPuVFE2qds81N4C=8RPz4edC-xddFvZGjA@mail.gmail.com>
Date: Wed, 26 Mar 2025 09:20:05 +0800
From: Huacai Chen <chenhuacai@...nel.org>
To: Marco Crivellari <marco.crivellari@...e.com>
Cc: "Maciej W. Rozycki" <macro@...am.me.uk>, linux-mips@...r.kernel.org,
linux-kernel@...r.kernel.org, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Frederic Weisbecker <frederic@...nel.org>, Anna-Maria Behnsen <anna-maria@...utronix.de>,
Thomas Gleixner <tglx@...utronix.de>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v6 1/1] MIPS: Fix idle VS timer enqueue
On Tue, Mar 25, 2025 at 10:09 PM Marco Crivellari
<marco.crivellari@...e.com> wrote:
>
> Hi Maciej,
>
> Thanks a lot for all the information.
>
> > Unlike `__r4k_wait' that code is not in a `.set noreorder' region and
> > the assembler will therefore resolve the hazard by inserting a NOP where
> > required by the architecture level requested (with `-march=...', etc.).
> > Try compiling that function for a MIPS III configuration such as
> > decstation_r4k_defconfig or just by hand with `-march=mips3' and see
> > what machine code is produced.
>
> I tried with the configuration you suggested, and indeed I can see a NOP.
>
> > Whatever manual you quote it refers to MIPS Release 2, which is only
> > dated 2003
>
> About the MIPS manual, anyhow, it is "MIPS32 M4 Processor Core" (year 2008).
> Maybe I've also picked the wrong manual.
>
> I've also found the manual you mentioned online, thanks.
>
> > Best is to avoid using a `.set noreorder' region in the first place.
> > But is it really needed here? Does the rollback area have to be of a
> > hardcoded size rather than one calculated by the assembler based on
> > actual machine code produced? It seems to me having it calculated would
> > reduce complexity here and let us use the EI instruction where available
> > as well.
>
> Well, considering the complexity and how the code looks fragile even with
> a small change, yes, that's likely better to avoid noreorder.
In my opinion keeping "noreorder" is the simplest, which means just
add an "nop" after MFC0 in the current version.
Huacai
>
> I think I'm going to need some guidance here.
> Please, correct me where something is wrong.
>
> 1)
> When you say "let us use the EI instruction where available" are you
> referring to do
> something like below?
>
> #if defined(CONFIG_CPU_HAS_DIEI)
> ei
> #else
> MFC0 t0, CP0_STATUS
> ori t0, 0x1f
> xori t0, 0x1e
> MTC0 t0, CP0_STATUS
> #endif
>
> 2)
> Removing "noreorder" would let the compiler add "nops" where they are needed.
> But that still means the 3 ssnop and ehb are still needed, right?
>
> My subsequent dumb question is: there is the guarantee that the
> compiler will not
> reorder / change something we did?
> This question also came after reading the manual you quoted (paragraph
> "Coprocessor Hazards"):
>
> "For example, after an mtc0 to the Status register which changes an
> interrupt mask bit,
> there will be two further instructions before the interrupt is
> actually enabled or disabled.
> [...]
> To cope with these situations usually requires the programmer to take explicit
> action to prevent the assembler from scheduling inappropriate
> instructions after a
> dangerous mtc0. This is done by using the .set noreorder directive,
> discussed below"
>
> 3)
> Considering the size is determined by the compiler, the check about
> the idle interrupt
> size region should not be needed, correct?
>
> 4)
> ori and PTR_ADDIU should be removed of course from the rollback handler macro.
> Can I have some hints about the needed change?
> Using QEMU is always working, so I'm not sure if what I will change is
> also correct.
>
>
> Many thanks in advance, also for your time!
>
>
>
>
> On Fri, Mar 21, 2025 at 9:11 PM Maciej W. Rozycki <macro@...am.me.uk> wrote:
> >
> > On Fri, 21 Mar 2025, Marco Crivellari wrote:
> >
> > > > This instruction sequence still suffers from the coprocessor move delay
> > > > hazard. How many times do I need to request to get it fixed (counting
> > > > three so far)?
> > >
> > > Can I have more details about this?
> > >
> > > I can see it is the same code present also in local_irq_enable()
> > > (arch_local_irq_enable()),
> >
> > Unlike `__r4k_wait' that code is not in a `.set noreorder' region and
> > the assembler will therefore resolve the hazard by inserting a NOP where
> > required by the architecture level requested (with `-march=...', etc.).
> > Try compiling that function for a MIPS III configuration such as
> > decstation_r4k_defconfig or just by hand with `-march=mips3' and see
> > what machine code is produced.
> >
> > > and from the manual I've seen:
> > >
> > > "The Spacing column shown in Table 2.6 and Table 2.7 indicates the
> > > number of unrelated instructions (such as NOPs or SSNOPs) that,
> > > prior to the capabilities of Release 2, would need to be placed
> > > between the producer and consumer of the hazard in order to ensure
> > > that the effects of the first instruction are seen by the second instruction."
> > >
> > > The "Spacing column" value is 3, indeed.
> > >
> > > "With the hazard elimination instructions available in Release 2, the
> > > preferred method to eliminate hazards is to place one of the
> > > instructions listed in Table 2.8 between the producer and consumer of the
> > > hazard. Execution hazards can be removed by using the EHB [...]"
> >
> > Whatever manual you quote it refers to MIPS Release 2, which is only
> > dated 2003 (following Release 1 from 2001), but `__r4k_wait' has to
> > continue handling older architecture revisions going back to MIPS III
> > ISA from 1991. We also support MIPS I ISA from 1985 which has further
> > instruction scheduling requirements, but `__r4k_wait' is for newer
> > processors only, because older ones had no WAIT instruction, so we can
> > ignore them (but note that the MIPS I load delay slot is regardless
> > observed in current code in the form of a NOP inserted after LONG_L).
> >
> > > What am I missing?
> >
> > This is common MIPS knowledge really, encoded in the GNU toolchain and
> > especially GAS since forever. While I can't cite a canonical reference,
> > the hazard is listed e.g. in Table 13.1 "Instructions with scheduling
> > implications" and Table 13.3 "R4xxx/R5000 Coprocessor 0 Hazards" from
> > "IDT MIPS Microprocessor Family Software Reference Manual," Version 2.0,
> > from October 1996. I do believe the document is available online.
> >
> > I'm fairly sure the hazard is also listed there in Dominic Sweetman's
> > "See MIPS Run Linux," but I don't have my copy handy right now.
> >
> > Best is to avoid using a `.set noreorder' region in the first place.
> > But is it really needed here? Does the rollback area have to be of a
> > hardcoded size rather than one calculated by the assembler based on
> > actual machine code produced? It seems to me having it calculated would
> > reduce complexity here and let us use the EI instruction where available
> > as well.
> >
> > HTH,
> >
> > Maciej
>
>
>
> --
>
> Marco Crivellari
>
> L3 Support Engineer, Technology & Product
>
>
>
>
> marco.crivellari@...e.com
Powered by blists - more mailing lists