Message-ID: <0f6fe0c6-7c11-4f16-bed4-db4de675b002@gmail.com>
Date: Tue, 18 Nov 2025 07:15:50 +0530
From: Jaikiran Pai <jai.forums2013@...il.com>
To: D Scott Phillips <scott@...amperecomputing.com>,
Catalin Marinas <catalin.marinas@....com>,
James Clark <james.clark@...aro.org>, James Morse <james.morse@....com>,
Joey Gouly <joey.gouly@....com>, Kevin Brodsky <kevin.brodsky@....com>,
Marc Zyngier <maz@...nel.org>, Mark Brown <broonie@...nel.org>,
Mark Rutland <mark.rutland@....com>, Oliver Upton <oliver.upton@...ux.dev>,
"Rob Herring (Arm)" <robh@...nel.org>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Shiqi Liu <shiqiliu@...t.edu.cn>, Will Deacon <will@...nel.org>,
Yicong Yang <yangyicong@...ilicon.com>, kvmarm@...ts.linux.dev,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] arm64: errata: Work around AmpereOne's erratum
AC04_CPU_23
On 17/11/25 10:47 pm, D Scott Phillips wrote:
> Jaikiran Pai <jai.forums2013@...il.com> writes:
>
>> Hello Scott,
>>
>> On 14/05/25 12:15 am, D Scott Phillips wrote:
>>> On AmpereOne AC04, updates to HCR_EL2 can rarely corrupt simultaneous
>>> translations for data addresses initiated by load/store instructions.
>>> Only instruction initiated translations are vulnerable, not translations
>>> from prefetches for example. A DSB before the store to HCR_EL2 is
>>> sufficient to prevent older instructions from hitting the window for
>>> corruption, and an ISB after is sufficient to prevent younger
>>> instructions from hitting the window for corruption.
>> I see that this patch enables the workaround only for AmpereOne AC04
>> systems. Do you happen to know if the underlying issue for which this
>> patch was introduced, impacts (or can impact) AmpereOne AC03 systems too:
> Hi Jaikiran, this issue impacts ac04 only, it is not present on ac03.
Thank you Scott for the quick confirmation.
We have been investigating an issue on AC03 (running Oracle Linux as a
VM) where some memory writes (stores) are lost, especially when the OS
appears to have accumulated high buff/cache usage (as reported by
free -h). That investigation, backed by a trivial C reproducer, is still
ongoing and we are trying to understand what could be causing it. The
erratum description here made us wonder whether we are running into the
same issue and, since the workaround in this patch is not enabled for
AC03, we decided to ask.
While at it, if you have any suggestions (tools/commands) that you
typically use to narrow down such issues, I would be happy to try them
where feasible. Right now we are focusing on the kernel itself and
checking which specific kernel versions can reproduce it. We have been
able to reproduce it consistently on 5.15.x and 5.16.x, and we plan to
try other kernel versions all the way up to 6.12. That should tell us
whether the issue we are encountering has already been addressed in any
specific kernel version.
Given that you noted this patch isn't relevant for AC03, I don't plan to
reply-all any further on this PATCH discussion. If you would like me to
keep you updated on this investigation (I would love to get some
input and provide updates as we go along), please let me know and I
will write to you directly over email (or any other relevant forum
you suggest).
Thank you again for the quick response.
-Jaikiran