Message-ID: <92d0a120d3303243d7bd72188c4f5974f525975a.camel@xry111.site>
Date: Tue, 01 Apr 2025 20:34:43 +0800
From: Xi Ruoyao <xry111@...111.site>
To: Marc Zyngier <maz@...nel.org>
Cc: Anshuman Khandual <anshuman.khandual@....com>, James Morse
<james.morse@....com>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, Shameer Kolothum
<shameerali.kolothum.thodi@...wei.com>, Mingcong Bai <jeffbai@...c.io>
Subject: Re: [PATCH] arm64: Add override for MPAM
On Tue, 2025-04-01 at 13:09 +0100, Marc Zyngier wrote:
> On Tue, 01 Apr 2025 12:47:03 +0100,
> Xi Ruoyao <xry111@...111.site> wrote:
> >
> > On Tue, 2025-04-01 at 14:04 +0530, Anshuman Khandual wrote:
> > > On 4/1/25 11:26, Xi Ruoyao wrote:
> > > > As the message of the commit 09e6b306f3ba ("arm64: cpufeature: discover
> > > > CPU support for MPAM") already states, if a buggy firmware fails to
> > > > either enable MPAM or emulate the trap as if it were disabled, the
> > > > kernel will just fail to boot. While upgrading the firmware should be
> > > > the best solution, we have some hardware whose vendor has not responded
> > > > in the 2 months since we requested a firmware update. Allow overriding
> > > > it so our devices don't become e-waste.
> > >
> > > There could be similar problems, where firmware might not enable arch
> > > features as required. Just wondering if there is a platform policy in
> > > place for enabling id-reg overrides for working around such scenarios
> > > > to prevent a kernel crash, etc.?
> >
> > In https://lore.kernel.org/all/87jzcfsuep.wl-maz@kernel.org/:
> >
> > > For such cases, when MPAM is incorrectly advertised, can we have kernel
> > > command line parameter like mpam=0 to override its detection?
> >
> > We could, but only when we can confirm what the problem is.
> >
> > And there is prior art, such as:
> >
> > commit 892f7237b3ffb090f1b1f1e55fe7c50664405aed
> > Author: Marc Zyngier <maz@...nel.org>
> > Date: Wed Jul 20 11:52:19 2022 +0100
> >
> > arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
> >
> > Even if we are now able to tell the kernel to avoid exposing SVE/SME
> > from the command line, we still have a couple of places where we
> > unconditionally access the ZCR_EL1 (resp. SMCR_EL1) registers.
> >
> > On systems with broken firmwares, this results in a crash even if
> > arm64.nosve (resp. arm64.nosme) was passed on the command-line.
> >
> > To avoid this, only update cpuinfo_arm64::reg_{zcr,smcr} once
> > we have computed the sanitised version for the corresponding
> > feature registers (ID_AA64PFR0 for SVE, and ID_AA64PFR1 for
> > SME). This results in some minor refactoring.
>
> That particular patch has caused quite a few issues, see d3c7c48d004f.
> So don't use it as a reference.
>
> Now, while I think an option is probably acceptable in the face of an
> unresponsive vendor, I don't think the way you implement it is the
> correct approach.
>
> It should be possible to handle the override in the assembly code,
> like we do for other bits and pieces, and deal with MPAMIDR_EL1 later
> down the line, once the sanitised ID registers are known to be valid.
OK, I'll try it.
--
Xi Ruoyao <xry111@...111.site>
School of Aerospace Science and Technology, Xidian University