Message-ID: <2b69fbd1-067e-4ff4-8ea4-88e32763209a@gmail.com>
Date: Sat, 19 Jul 2025 04:45:58 +0800
From: Nick Chan <towinchenmi@...il.com>
To: Mark Rutland <mark.rutland@....com>
Cc: Will Deacon <will@...nel.org>, Rob Herring <robh@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley
<conor+dt@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
Janne Grunau <j@...nau.net>, Alyssa Rosenzweig <alyssa@...enzweig.io>,
Neal Gompa <neal@...pa.dev>, Sven Peter <sven@...nel.org>,
Marc Zyngier <maz@...nel.org>, linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, devicetree@...r.kernel.org,
asahi@...ts.linux.dev, linux-kernel@...r.kernel.org,
Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
Subject: Re: [PATCH RESEND v7 00/21] drivers/perf: apple_m1: Add Apple A7-A11,
T2 SoC support
On 2025/7/18 at 11:01 PM, Mark Rutland wrote:
> On Fri, Jul 18, 2025 at 01:00:45AM +0800, Nick Chan wrote:
>> On 17/7/2025 23:05, Mark Rutland wrote:
>>> On Mon, Jul 14, 2025 at 11:59:36PM +0800, Nick Chan wrote:
>>>> On 2025/7/14 at 11:12 PM, Will Deacon wrote:
>>>>> On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
>>>>>> Patches 8-12 add support for the older SoCs.
>>>>> ... but I'm not sure if anybody actually cares about these older SoCs
>>>>> and, even if they do, what the state of the rest of Linux is on those
>>>>> parts. I recall horror stories about the OS being quietly migrated
>>>>> between CPUs with incompatible features, at which point I think we have
>>>>> to question whether we actually care about supporting this hardware.
>>>> The "horror" story you mentioned is about Apple A10/A10X/T2, which
>>>> have a big little switcher integrated into the cpufreq block, so when the
>>>> cpufreq driver switches between states in the same way as on other
>>>> SoCs, on these SoCs that silently causes a CPU migration. There
>>>> is only one incompatible feature that I am aware of, which is 32-bit EL0
>>>> support.
>>> Surely the MIDR/REVIDR/AIDR also change?
>> They do not change. ID_AA64PFR0_EL1 also does not change (fixed 0x12).
>> What *does* change, however, is MPIDR. (P-cores have bit 16 set while
>> E-cores do not)
> The MPIDR changing isn't ok either. You might get away with that today,
> but that's not supposed to change behind the back of the kernel.
>
> Is there anything else that can change, or are we absolutely certain
> that *only* MPIDR changes?
Only MPIDR changes, and the state of bit 16 in MPIDR is consistent across all PEs. (At any
given moment, either all PEs are backed by efficiency cores, or all backed by performance
cores)
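
For reference, here is a rough sketch of how the two values discussed above
decode. It is only an illustration, not code from this series: the helper and
macro names are made up for this mail, and it assumes the "0x12" quoted
earlier is the low byte of ID_AA64PFR0_EL1 (EL0 field in bits [3:0], EL1 field
in bits [7:4]). Bit 16 of MPIDR is the lowest bit of Aff2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* ID_AA64PFR0_EL1 ELx field encodings per the Arm ARM. */
#define ELX_AARCH64_ONLY	0x1	/* AArch64 only */
#define ELX_AARCH64_AARCH32	0x2	/* AArch64 and AArch32 */

/* Bit 16 of MPIDR_EL1, i.e. the lowest bit of Aff2 (bits [23:16]). */
#define MPIDR_PCORE_BIT		(UINT64_C(1) << 16)

/* ID_AA64PFR0_EL1.EL0 lives in bits [3:0]. */
static inline unsigned int id_aa64pfr0_el0(uint64_t pfr0)
{
	return pfr0 & 0xf;
}

/* ID_AA64PFR0_EL1.EL1 lives in bits [7:4]. */
static inline unsigned int id_aa64pfr0_el1(uint64_t pfr0)
{
	return (pfr0 >> 4) & 0xf;
}

/*
 * On A10/A10X/T2 as described in this thread: bit 16 set means the PE is
 * currently backed by a performance core, clear means an efficiency core.
 */
static inline bool mpidr_is_pcore(uint64_t mpidr)
{
	return (mpidr & MPIDR_PCORE_BIT) != 0;
}
```

With the fixed value 0x12, id_aa64pfr0_el0() gives ELX_AARCH64_AARCH32 and
id_aa64pfr0_el1() gives ELX_AARCH64_ONLY, which is why the E-cores appear to
advertise AArch32 EL0 even though they cannot execute it.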
>
>>>> As mentioned above, it all works fine when CONFIG_EXPERT is not
>>>> enabled, and if it is enabled, then 32-bit processes may crash with an illegal
>>>> instruction but everything else will still work fine.
>>> I don't think that's quite true, unless these parts are also violating
>>> the architecture.
>>>
>>> If the CPU doesn't implement AArch32, then an ERET to AArch32 is
>>> illegal. The way illegal exception returns are handled means that this
>>> will result in a (fatal) illegal execution state exception being taken
>>> from the exception return code in the kernel, not an UNDEF being taken
>>> from userspace that would result in a SIGILL.
>> Speaking from experience, when testing with the userspace cpufreq governor,
>> trying to run AArch32 code on the E-cores really does result in an illegal
>> instruction for that process while everything else remains fine.
>>
>> Referencing ID_AA64PFR0_EL1, the E-cores do claim to support
>> AArch32 EL0, even though they cannot actually execute it.
> Ok, so that's a clear violation of the architecture, and doesn't fill me
> with confidence about anything else.
Regarding this, the hardware also needs to handle the case where the PE is already in AArch32
EL0 when a migration to the E-cores is attempted. In that case no exception return takes place, so
the behavior of the hardware is not as bad as it sounds.
>
>>> I do not think that we should pretend to support hardware with silent
>>> microarchitectural migration. So at the very least, we do not care about
>>> A10/A10X/T2.
>> As explained above, what actually happens on the hardware is different
>> from what you believed, so please do reconsider.
> Different certainly, but still problematic.
>
> I maintain that we should not pretend to support this hardware.
>
> Mark.
>
Nick Chan