[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=V=Jqcq+oS1xQr71eqOHSbW4VHT=_AEDU6upjwsPKGKdg@mail.gmail.com>
Date: Mon, 30 Oct 2023 16:19:55 -0700
From: Doug Anderson <dianders@...omium.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Marc Zyngier <maz@...nel.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Chen-Yu Tsai <wenst@...omium.org>,
Amit Daniel Kachhap <amit.kachhap@....com>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>,
James Morse <james.morse@....com>,
Joey Gouly <joey.gouly@....com>,
Mark Brown <broonie@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mediatek@...ts.infradead.org
Subject: Re: [PATCH 1/3] arm64: Disable GiC priorities on Mediatek devices w/
firmware issues
Hi,
On Wed, Oct 18, 2023 at 4:01 AM Mark Rutland <mark.rutland@....com> wrote:
>
> On Fri, Oct 06, 2023 at 03:15:51PM -0700, Douglas Anderson wrote:
> > In commit 44bd78dd2b88 ("irqchip/gic-v3: Disable pseudo NMIs on
> > Mediatek devices w/ firmware issues") we added a method for detecting
> > Mediatek devices with broken firmware and disabled pseudo-NMI. While
> > that worked, it didn't address the problem at a deep enough level.
> >
> > The fundamental issue with this broken firmware is that it's not
> > saving and restoring several important GICR registers. The current
> > list is believed to be:
> > * GICR_NUM_IPRIORITYR
> > * GICR_CTLR
> > * GICR_ISPENDR0
> > * GICR_ISACTIVER0
> > * GICR_NSACR
> >
> > Pseudo-NMI didn't work because it was the only thing (currently) in
> > the kernel that relied on the broken registers, so forcing pseudo-NMI
> > off was an effective fix. However, it could be observed that calling
> > system_uses_irq_prio_masking() on these systems still returned
> > "true". That caused confusion and led to the need for
> > commit a07a59415217 ("arm64: smp: avoid NMI IPIs with broken MediaTek
> > FW"). It's worried that the incorrect value returned by
> > system_uses_irq_prio_masking() on these systems will continue to
> > confuse future developers.
> >
> > Let's fix the issue a little more completely by disabling IRQ
> > priorities at a deeper level in the kernel. Once we do this we can
> > revert some of the other bits of code dealing with this quirk.
> >
> > Signed-off-by: Douglas Anderson <dianders@...omium.org>
> > ---
> >
> > arch/arm64/kernel/cpufeature.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 2806a2850e78..e35efab8efa9 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -2094,9 +2094,30 @@ static int __init early_enable_pseudo_nmi(char *p)
> > }
> > early_param("irqchip.gicv3_pseudo_nmi", early_enable_pseudo_nmi);
> >
> > +static bool are_gic_priorities_broken(void)
> > +{
> > + bool is_broken = false;
> > + struct device_node *np;
> > +
> > + /*
> > + * Detect broken Mediatek firmware that doesn't properly save and
> > + * restore GIC priorities.
> > + */
> > + np = of_find_compatible_node(NULL, NULL, "arm,gic-v3");
> > + if (np) {
> > + is_broken = of_property_read_bool(np, "mediatek,broken-save-restore-fw");
> > + of_node_put(np);
> > + }
> > +
> > + return is_broken;
> > +}
>
> I'm definitely in favour of detecting this in the cpucap, but I think it'd be
> better to parse the DT once on the boot CPU rather than on each CPU every time
> it's brought up.
>
> I think if we add something like:
>
> #ifdef CONFIG_ARM64_PSEUDO_NMI
> static void detect_system_supports_pseudo_nmi(void)
> {
> struct device_node *np;
>
> if (!enable_pseudo_nmi)
> return;
>
> /*
> * Detect broken Mediatek firmware that doesn't properly save and
> * restore GIC priorities.
> */
> np = of_find_compatible_node(NULL, NULL, "arm,gic-v3");
> if (np && of_property_read_bool(np, "mediatek,broken-save-restore-fw")) {
> pr_info("Pseudo-NMI disabled due to Mediatek Chromebook GICR save problem");
> enable_pseudo_nmi = false;
> }
> of_node_put(np);
> }
> #endif /* CONFIG_ARM64_PSEUDO_NMI */
> static inline void detect_system_supports_pseudo_nmi(void) { }
> #endif
>
> ... then we can call that from init_cpu_features() before we call
> setup_boot_cpu_capabilities(), and then the existing logic in
> can_use_gic_priorities() should just work as that returns the value of
> enable_pseudo_nmi.
>
> Note: of_node_put(NULL) does nothing, like kfree(NULL), so it's fine for that
> to be called in the !np case.
>
> Would you be happy to fold that in? I'm happy with a Suggested-by tag if so. :)
Yup, that looks good to me and I can fold it in (fixing a few nits
like missing "\n" and adding __init to the function). I'll wait to get
maintainers opinions on whether to fold patch #3 in here and then send
a v2.
-Doug
Powered by blists - more mailing lists