Message-ID: <868qfipfij.wl-maz@kernel.org>
Date: Thu, 04 Dec 2025 13:01:40 +0000
From: Marc Zyngier <maz@...nel.org>
To: Pavan Kondeti <pavan.kondeti@....qualcomm.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
linux-arm-msm@...r.kernel.org
Subject: Re: Alternative to arm64.nopauth cmdline for disabling Pointer Authentication
[dropping Ricardo, as his address bounces]
On Thu, 04 Dec 2025 10:36:12 +0000,
Pavan Kondeti <pavan.kondeti@....qualcomm.com> wrote:
>
> Hi Marc,
>
> On Thu, Dec 04, 2025 at 09:15:29AM +0000, Marc Zyngier wrote:
> > On Thu, 04 Dec 2025 04:07:15 +0000,
> > Pavan Kondeti <pavan.kondeti@....qualcomm.com> wrote:
> > >
> > > Hi
> > >
> > > The pointer authentication feature (PAuth) is only supported on
> > > CPUs 0-3; it is not supported on CPUs 4-7 on QCS8300.
> >
> > On what grounds? Hardware incompatibility? I seriously doubt it, since
> > nobody glues pre-8.3 CPUs to anything more modern. Or, as I expect it,
> > a firmware implemented with little understanding of what is required?
>
> I don't know the answer to this question. I will talk to folks who may
> know the answer and get back to you.
>
> Can you please elaborate on the firmware part you are talking here? I
> see that Linux runs at EL2 and the AA64ISAR1 register value on CPU#0 (A78)
> indicates that PAuth is supported, but not on CPU#4 (A55). I am told there
> are no other controls outside EL2 (trap) to manipulate this feature. So,
> I am assuming that this is indeed reflecting the HW.

Neither A78 nor A55 has PAuth. They are both firmly ARMv8.2 CPUs, and
predate this functionality. So I guess that there are only two possible
outcomes:

- either the FW is indeed not at fault, but you have a *third*
  type of CPU that is at least 8.3 in the mix

- or you misidentified the CPUs that are on your system, they
  have PAuth, and the firmware is borked

Which one is it?

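
For readers following along, the register comparison discussed above can be
reproduced from raw values. The following is a minimal sketch, assuming you
already have the ID_AA64ISAR1_EL1 value of each CPU (for example from a
firmware or EL2 dump; how you obtain it is outside this sketch). The field
positions come from the Arm architecture; the function names are
hypothetical. Later architecture revisions add further PAuth fields in
ID_AA64ISAR2_EL1, which are not covered here.

```python
# Bit offsets of the 4-bit pointer authentication fields in ID_AA64ISAR1_EL1.
PAUTH_FIELDS = {
    "APA": 4,   # address authentication, architected (QARMA5) algorithm
    "API": 8,   # address authentication, IMPLEMENTATION DEFINED algorithm
    "GPA": 24,  # generic authentication, architected (QARMA5) algorithm
    "GPI": 28,  # generic authentication, IMPLEMENTATION DEFINED algorithm
}


def decode_pauth(isar1):
    """Return the four PAuth field values of a raw ID_AA64ISAR1_EL1 value."""
    return {name: (isar1 >> shift) & 0xF for name, shift in PAUTH_FIELDS.items()}


def has_address_auth(isar1):
    """True if either address authentication field is non-zero."""
    fields = decode_pauth(isar1)
    return fields["APA"] != 0 or fields["API"] != 0


def pauth_mismatch(isar1_by_cpu):
    """Return the CPU ids whose PAuth fields differ from those of CPU 0."""
    reference = decode_pauth(isar1_by_cpu[0])
    return sorted(cpu for cpu, value in isar1_by_cpu.items()
                  if decode_pauth(value) != reference)
```

Feeding it one value per CPU shows which cores disagree with the boot CPU,
which is the situation described for CPUs 4-7 in this thread.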
>
> >
> > > The ARM64 cpufeature discovery code expects late CPUs to have
> > > this feature if boot CPU feature has it since PAuth is enabled
> > > early. When a conflict like this is detected, the late CPUs are
> > > not allowed to boot. The system is expected to remain functional
> > > on the CPUs that have the PAuth feature supported and enabled.
> > > This is not a desired behavior in production.
> >
> > What is even less desirable is to produce this sort of contraption.
> >
> > > We started seeing this problem when Linux is booted in EL2. When Linux
> > > is running under Gunyah (Type-1 hypervisor), Pointer Authentication
> > > feature is hidden from EL1 via HCR_EL2.TID3.
> > >
> > > arm64.nopauth can be passed on kernel cmdline to disable the feature
> > > in kernel so that all CPUs can boot on QCS8300. I am told
> > > maintaining a custom kernel commandline per SoC in a Generic OS
> > > distribution is not recommended and was asked to discuss the problem
> > > with the community [1]
> >
> > Well, you get to own the problem you have created for yourself. You
> > build hardware/firmware that cannot run generic SW, and yet you want
> > generic SW to run seamlessly on it. Spot the issue?
> >
> > > This patch [2] from Catalin adds a devicetree property under memory {}
> > > to disable MTE. I believe this work predates the id-reg override
> > > mechanism. However, this made me think if workarounds like this can be
> > > detected via devicetree, for example a property under cpu { } node.
> >
> > Not only does it predate it, but it also doesn't work in general. For a
> > start, it is DT specific. How are you going to make that work for
> > ACPI? I know you don't care, but I do.
>
> Point taken. I understand that this does not fall under errata but is
> there a possibility to introduce an Errata targeting CPU#0 MIDR and
> disabling the Pointer authentication? I understand that if there is
> another Qualcomm SoC that exists with all CPUs supporting pointer
> authentication with same MIDR, we may be disabling the feature but this
> is something I can check internally.
>
> >
> > > Given that what we put in `chosen { bootargs="" }` under the
> > > respective SoC devicetree can be overridden by the bootloader, should we
> > > have a **sticky** cmdline to specify critical workarounds like this?
> > > This would be more generic than introducing any new parameters.
> >
> > You already have a way to have a sticky command-line, by building it
> > into the kernel. Yes, I understand that this isn't what you want, but:
> >
> > (1) a user should be allowed to pass the kernel command-line *they*
> > want, not what someone has decided for them
>
> Agreed. This is what made me ask the question. Should the kernel have a
> sticky command line which may have critical workarounds like this?

Absolutely *not*. You are not in charge of defining what is good for
the user. If the user themselves want that, they have plenty of ways
to achieve that particular goal already. Put it in the bootargs
string, in the kernel build, in a grub config file, as a u-boot
hack... There is an infinite number of choices already, and we don't
need an extra one to hide how ugly their HW is.

> > (2) the generic mechanism exists, doesn't rely on additional firmware
> > specifications, and is used for a whole lot of other QC platforms
> > suffering from the same issue of broken firmware. What are you
> > going to do for these?
>
> By the generic mechanism, do you mean the bootloader passing the kernel
> cmdline with `arm64.nopauth`, or something else?

Exactly that. This is the mechanism by which we instruct the kernel
not to use a particular feature if it can avoid it. It is easy to add,
doesn't depend on new esoteric firmware interfaces, and is a constant
reminder that you are dealing with stuff that isn't fit for purpose.

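
To make the effect of that option concrete, here is a simplified model of
what it does to the feature register as seen by the kernel. This is a sketch
under stated assumptions, not the kernel's implementation: it assumes that
`arm64.nopauth` amounts to forcing the APA, API, GPA and GPI fields of
ID_AA64ISAR1_EL1 to zero, and the function names are hypothetical.

```python
# Bit offsets of the APA, API, GPA and GPI fields in ID_AA64ISAR1_EL1.
PAUTH_SHIFTS = (4, 8, 24, 28)
MASK64 = 0xFFFFFFFFFFFFFFFF


def apply_nopauth(isar1):
    """Clear the four 4-bit PAuth fields, leaving every other field intact."""
    for shift in PAUTH_SHIFTS:
        isar1 &= ~(0xF << shift)
    return isar1 & MASK64


def effective_isar1(isar1, cmdline):
    """Model the register value the kernel acts on for a given command line.

    Assumption: the option is matched as a whole, whitespace-separated word.
    """
    if "arm64.nopauth" in cmdline.split():
        return apply_nopauth(isar1)
    return isar1
```

With the option present, a CPU that advertises PAuth and one that does not
end up with the same PAuth view, so the late-CPU mismatch no longer arises.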
> > (3) what if you, by miracle, happened to *fix* the firmware?
>
> As I have asked above, the firmware part is not clear.

Well, your description of the root cause of the problem isn't clear
either, so we're even! ;-)

	M.

--
Without deviation from the norm, progress is not possible.