[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8a950aa5-fdd8-f983-0411-4b39ade596f4@quicinc.com>
Date: Tue, 11 Jul 2023 18:12:48 +0800
From: "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com>
To: Marc Zyngier <maz@...nel.org>
CC: <will@...nel.org>, <corbet@....net>, <catalin.marinas@....com>,
<quic_pkondeti@...cinc.com>, <quic_kaushalk@...cinc.com>,
<quic_satyap@...cinc.com>, <quic_shashim@...cinc.com>,
<quic_songxue@...cinc.com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH] arm64: Add the arm64.nolse_atomics command line option
On 7/11/2023 2:57 PM, Marc Zyngier wrote:
> On Tue, 11 Jul 2023 04:30:44 +0100,
> "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com> wrote:
>>
>> On 7/10/2023 5:31 PM, Marc Zyngier wrote:
>>> On Mon, 10 Jul 2023 09:19:54 +0100,
>>> "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com> wrote:
>>>>
>>>> On 7/10/2023 3:27 PM, Marc Zyngier wrote:
>>>>> On Mon, 10 Jul 2023 06:59:55 +0100,
>>>>> Maria Yu <quic_aiquny@...cinc.com> wrote:
>>>>>>
>>>>>> In order to be able to disable lse_atomic even if cpu
>>>>>> support it, most likely because of memory controller
>>>>>> cannot deal with the lse atomic instructions, use a
>>>>>> new idreg override to deal with it.
>>>>>
>>>>> In general, the idreg overrides are *not* there to paper over HW bugs.
>>>>> They are there to force the kernel to use or disable a feature for
>>>>> performance reason or to guide the *enabling* of a feature, but not
>>>>> because the HW is broken.
>>>>>
>>>>> The broken status of a HW platform must also be documented so that we
>>>>> know what to expect when we look at, for example, a bad case of memory
>>>>> corruption (something I'd expect to see on a system that only
>>>>> partially implements atomic memory operations).
>>>>>
>>>>
>>>> good idea. A noc error would be happened if the lse atomic instruction
>>>> happened during a memory controller doesn't support lse atomic
>>>> instructions.
>>>> I can put the information in next patchset comment message. Pls feel
>>>> free to let know if there is other place to have this kind of
>>>> information with.
>>>
>>> For a start, Documentation/arch/arm64/silicon-errata.rst should
>>> contain an entry for the actual erratum, and a description of the
>>> symptoms of the issue (you're mentioning a "noc error": how is that
>>> reported to the CPU?).
>>
>> This is not a cpu's errata as my understanding. It is the DDR
>> subsystem which don't have the LSE atomic feature supported.
>
> CPU or not doesn't matter. We also track system errata.
Thank you for clarify on this.
If I am correct understanding, are you suggesting system errata with DT
seperate compatible (or similar) to runtime disable this feature instead
of idreg override with arm64.nolse options?
While it is better to finally affect the host arm64_ftr_regs value since
it can also derived to kvm sys reg as well.
>
>>>
>>> The workaround should also be detected at runtime -- we cannot rely on
>>> the user to provide a command-line argument to disable an essential
>>> feature that anyone has taken for granted for most of a decade...
>>
>> We are also seeking help from DDR Subsystem POC to see whether it is
>> possible to detect the LSE atomic feature support or not at runtime.
>
> Keying it off a DT compatible (or something similar) would work.
>
>> In my opinion, LSE atomic is a system level feature instead of a cpu
>> only feature. So currently solution we is that even if cpu support lse
>> atomic, but it still needed to be disabled if the cpu working with a
>> lse atomic not support by current system's DDR subsystem.
>
> In the absence of a detection mechanism for anything past the CPU,
> this is a moot point. At this stage, this is a bit like saying
> "writing to memory is a system thing, not only a CPU feature".
>
> And this also breaks KVM if these CPUs don't have FWB, as a guest can
> always map a piece of memory as non-cacheable, and trigger the issue
> you describe in your reply to Will, even if you hide the atomics on
> the host.
>
For the KVM part, per my understanding, as long as the current feature
id being overriden, the KVM system also get the current vcpu without the
lse atomic feature enabled.
KVM vcpu will read the sys reg from host arm64_ftr_regs which is already
been controled by the idreg_overrides.
check reference from:
https://elixir.bootlin.com/linux/v6.5-rc1/source/arch/arm64/kernel/cpufeature.c#L680
https://elixir.bootlin.com/linux/v6.5-rc1/source/arch/arm64/kvm/sys_regs.c#L1360
> M.
>
--
Thx and BRs,
Aiqun(Maria) Yu
Powered by blists - more mailing lists