Message-ID: <86v8gym0ys.wl-maz@kernel.org>
Date: Fri, 12 May 2023 09:02:35 +0100
From: Marc Zyngier <maz@...nel.org>
To: Douglas Anderson <dianders@...omium.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Rob Herring <robh+dt@...nel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Matthias Brugger <matthias.bgg@...il.com>,
devicetree@...r.kernel.org, linux-mediatek@...ts.infradead.org,
wenst@...omium.org, Eddie Huang <eddie.huang@...iatek.com>,
Allen-KH Cheng <allen-kh.cheng@...iatek.com>,
Ben Ho <Ben.Ho@...iatek.com>, Weiyi Lu <weiyi.lu@...iatek.com>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>,
linux-arm-kernel@...ts.infradead.org,
Tinghan Shen <tinghan.shen@...iatek.com>, jwerner@...omium.org,
Hsin-Hsiung Wang <hsin-hsiung.wang@...iatek.com>,
yidilin@...omium.org, Seiya Wang <seiya.wang@...iatek.com>,
Conor Dooley <conor+dt@...nel.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/6] dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW
On Thu, 11 May 2023 23:05:35 +0100,
Douglas Anderson <dianders@...omium.org> wrote:
>
> When trying to turn on the "pseudo NMI" kernel feature in Linux, it
> was discovered that all Mediatek-based Chromebooks that ever shipped
> (at least ones with GICv3) had a firmware bug where they wouldn't save
> certain GIC "GICR" registers properly. If a processor ever entered a
> suspend/idle mode where the GICR registers lost state then they'd be
> reset to their default state.
>
> As a result of the bug, if you try to enable "pseudo NMIs" on the
> affected devices then certain interrupts will unexpectedly get
> promoted to be "pseudo NMIs" and cause crashes / freezes / general
> mayhem.
>
> ChromeOS is looking to start turning on "pseudo NMIs" in production to
> make crash reports more actionable. To do so, we will release firmware
> updates for at least some of the affected Mediatek Chromebooks.
> However, even when we update the firmware of a Chromebook it's always
> possible that a user will end up booting with old firmware. We need to
> be able to detect when we're running with firmware that will crash and
> burn if pseudo NMIs are enabled.
>
> The current plan is:
> * Update the device trees of all affected Chromebooks to include the
> 'mediatek,gicr-save-quirk' property. The kernel can use this to know
> not to enable certain features like "pseudo NMI". NOTE: device trees
> for Chromebooks are never baked into the firmware but are bundled
> with the kernel. A kernel will never be configured to use "pseudo
> NMIs" and be bundled with an old device tree.
> * When we get a fixed firmware for one of these Chromebooks, it will
> patch the device tree to remove this property.

Since you're in control of distributing the FW together with the
kernel, I assume you're also in control of the command line. Why can't
the fixed firmware pass the option enabling pseudo-NMI support, sparing
us all of this?

>
> For some details, you can also see the public bug
> <https://issuetracker.google.com/281831288>
>
> Signed-off-by: Douglas Anderson <dianders@...omium.org>
> ---
>
> .../bindings/interrupt-controller/arm,gic-v3.yaml | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> index 92117261e1e1..8c251caae537 100644
> --- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> +++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
> @@ -166,6 +166,12 @@ properties:
> resets:
> maxItems: 1
>
> + mediatek,gicr-save-quirk:

I think this deserves something *much* stronger that outlines what is
wrong, because this is not just a quirk. This is a failure to even
remotely grasp the requirements of the architecture (and to use
standard, public code that would have done it correctly). Something
like "mediatek,broken-save-restore-fw" would be more adequate.

> + type: boolean
> + description:
> + Asserts that the firmware on this device has issues saving and restoring
> + GICR registers when CPUs are powered off.

Nit: not the CPUs, but the GIC redistributors.
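
For illustration only, here is a rough sketch of how the binding entry
might read with both comments above applied. The property name is the
one suggested earlier in this reply; the description wording is an
assumption and not text from the patch, so treat it as a starting point
rather than a final form.

```yaml
  # Hypothetical revision of the hunk quoted above, not the posted patch.
  mediatek,broken-save-restore-fw:
    type: boolean
    description:
      Asserts that the firmware on this device has issues saving and restoring
      GICR registers when the GIC redistributors are powered off.
```

Naming the property after the broken firmware behaviour, rather than
calling it a quirk, makes it obvious to a reader of the device tree
where the fault lies.
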
Thanks,
M.
--
Without deviation from the norm, progress is not possible.