Message-ID: <20260109103416.GAaWDZqDYoyt3KRAE9@fat_crate.local>
Date: Fri, 9 Jan 2026 11:34:16 +0100
From: Borislav Petkov <bp@...en8.de>
To: Ruidong Tian <tianruidong@...ux.alibaba.com>
Cc: catalin.marinas@....com, will@...nel.org, lpieralisi@...nel.org,
	guohanjun@...wei.com, sudeep.holla@....com,
	xueshuai@...ux.alibaba.com, linux-kernel@...r.kernel.org,
	linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	rafael@...nel.org, lenb@...nel.org, tony.luck@...el.com,
	yazen.ghannam@....com, misono.tomohiro@...itsu.com,
	fengwei_yin@...ux.alibaba.com
Subject: Re: [PATCH v5 00/17] ARM Error Source Table V2 Support

On Mon, Jan 05, 2026 at 05:12:25PM +0800, Ruidong Tian wrote:
> > What is a "RAS node"?
> A RAS node is the hardware interface for error reporting and control,
> consisting of one or more register sets (a collection of RAS records). It is
> responsible for error logging and interrupt signaling[0].

OMG, one more meaning for the word "node". Because we're not ambiguous enough.

/facepalm

> A single hardware component can feature multiple RAS nodes. For example, a
> memory controller is treated as a "RAS device", where each memory channel
> has its own RAS node. Interrupts generated by these nodes are typically
> aggregated into a single interrupt line managed at the RAS device level.

Nomenclaturial tragedy, I'd say.

> Comparison with x86 MCA:
>
> RAS record ≈ MCA bank.
> RAS node ≈ A set of MCA banks + CMCI on a core.
>
> The key difference lies in uncore handling: x86 typically maps uncore errors
> (like those from a memory controller) into core-based MCA banks. In
> contrast, ARM requires uncore components to provide their own standalone RAS
> nodes. When a component requires multiple such nodes, they are grouped and
> managed as a "RAS device" in the AEST driver.
>
> [0]: https://developer.arm.com/documentation/ihi0100/latest

Yah, thanks for explaining.
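
To pin the terminology down, the record/node/device hierarchy described in
the quoted text could be modeled roughly as below. This is only a sketch:
the struct names, fields and the helper are made up for illustration and are
not the AEST driver's actual definitions. Treating a nonzero status word as
"error logged" is a simplifying assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the terminology, not the real AEST driver types. */

/* One RAS record: roughly comparable to a single x86 MCA bank. */
struct ras_record {
	uint64_t status;	/* assumption: nonzero means an error is logged */
	uint64_t addr;
};

/* One RAS node: a set of records plus its own interrupt signaling. */
struct ras_node {
	struct ras_record *records;
	size_t nr_records;
};

/* One RAS device, e.g. a memory controller with one node per channel. */
struct ras_device {
	struct ras_node *nodes;
	size_t nr_nodes;
	int irq;		/* the single, aggregated interrupt line */
};

/* Walk every node of a device and count the records that logged an error. */
static size_t ras_device_count_errors(const struct ras_device *dev)
{
	size_t errors = 0;

	assert(dev != NULL);

	for (size_t i = 0; i < dev->nr_nodes; i++) {
		const struct ras_node *node = &dev->nodes[i];

		for (size_t j = 0; j < node->nr_records; j++) {
			if (node->records[j].status)
				errors++;
		}
	}

	return errors;
}
```

Since the interrupt is aggregated at the device level, a handler for it would
have to walk all nodes like this in order to find out which record fired.
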
> > The ATL is very AMD-specific. What does "conceptually similar" mean exactly?
> By "conceptually similar," I mean that both ARM and AMD share the same
> functional requirement: translating between a System Physical Address (SPA)
> and a device-specific address (like a DRAM address) for RAS purposes.
>
> The goal here is not to share the hardware-specific translation logic, but
> to provide a unified interface (an abstraction layer). The actual
> implementation of the translation remains entirely architecture-specific.

And why do we need an arch-overlapping unified interface?

You can just as well have aest_convert_la_to_spa() and none of that "unifying"
churn.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette