Message-ID: <874jmc8654.wl-maz@kernel.org>
Date: Mon, 10 Jul 2023 10:31:19 +0100
From: Marc Zyngier <maz@...nel.org>
To: "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com>
Cc: <will@...nel.org>, <corbet@....net>, <catalin.marinas@....com>,
<quic_pkondeti@...cinc.com>, <quic_kaushalk@...cinc.com>,
<quic_satyap@...cinc.com>, <quic_shashim@...cinc.com>,
<quic_songxue@...cinc.com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH] arm64: Add the arm64.nolse_atomics command line option

On Mon, 10 Jul 2023 09:19:54 +0100,
"Aiqun(Maria) Yu" <quic_aiquny@...cinc.com> wrote:
>
> On 7/10/2023 3:27 PM, Marc Zyngier wrote:
> > On Mon, 10 Jul 2023 06:59:55 +0100,
> > Maria Yu <quic_aiquny@...cinc.com> wrote:
> >>
> >> In order to be able to disable lse_atomic even if the CPU
> >> supports it, most likely because the memory controller
> >> cannot deal with the LSE atomic instructions, use a
> >> new idreg override to deal with it.
> >
> > In general, the idreg overrides are *not* there to paper over HW bugs.
> > They are there to force the kernel to use or disable a feature for
> > performance reasons or to guide the *enabling* of a feature, but not
> > because the HW is broken.
> >
> > The broken status of a HW platform must also be documented so that we
> > know what to expect when we look at, for example, a bad case of memory
> > corruption (something I'd expect to see on a system that only
> > partially implements atomic memory operations).
> >
>
> Good idea. A noc error is raised if an LSE atomic instruction is
> issued while the memory controller does not support LSE atomic
> instructions.
> I can put this information in the commit message of the next
> patchset. Please feel free to let me know if there is another place
> where this kind of information should go.

For a start, Documentation/arch/arm64/silicon-errata.rst should
contain an entry for the actual erratum, and a description of the
symptoms of the issue (you're mentioning a "noc error": how is that
reported to the CPU?).

The workaround should also be detected at runtime -- we cannot rely on
the user to provide a command-line argument to disable an essential
feature that anyone has taken for granted for most of a decade...

[...]
> >> @@ -185,6 +195,7 @@ static const struct {
> >> { "arm64.nomops", "id_aa64isar2.mops=0" },
> >> { "arm64.nomte", "id_aa64pfr1.mte=0" },
> >> { "nokaslr", "arm64_sw.nokaslr=1" },
> >> + { "arm64.nolse_atomic", "id_aa64isar0.atomic=0" },
> >
> > And what of 32bit?

This particular question still stands, as it is likely to affect VMs.
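For readers following along, here is a rough user-space sketch of what
an alias table like the one in the quoted hunk does: it maps a
user-facing option name to a "register.field=value" override string.
This is only an illustration, not the kernel's actual implementation.
The struct name, the array name and the expand_alias() helper are all
hypothetical; only the table entries are taken from the hunk above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias entry: option name -> "register.field=value". */
struct alias {
	const char *name;
	const char *override;
};

/* Entries copied from the quoted hunk; the array itself is illustrative. */
static const struct alias aliases[] = {
	{ "arm64.nomops",       "id_aa64isar2.mops=0" },
	{ "arm64.nomte",        "id_aa64pfr1.mte=0" },
	{ "nokaslr",            "arm64_sw.nokaslr=1" },
	{ "arm64.nolse_atomic", "id_aa64isar0.atomic=0" },
};

/*
 * Return the override string for an exact option-name match,
 * or NULL if the option is not a known alias.
 */
static const char *expand_alias(const char *opt)
{
	size_t i;

	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
		if (strcmp(opt, aliases[i].name) == 0)
			return aliases[i].override;
	}
	return NULL;
}
```

Calling expand_alias("arm64.nolse_atomic") on this table yields the
override for the atomic field of id_aa64isar0, while an unknown option
yields NULL. Note that such a table only covers the one register field
it names, which is why the 32bit question matters.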
M.

--
Without deviation from the norm, progress is not possible.