linux-kernel - Re: [PATCH v5 00/17] ARM Error Source Table V2 Support

Message-ID: <20260109103416.GAaWDZqDYoyt3KRAE9@fat_crate.local>
Date: Fri, 9 Jan 2026 11:34:16 +0100
From: Borislav Petkov <bp@...en8.de>
To: Ruidong Tian <tianruidong@...ux.alibaba.com>
Cc: catalin.marinas@....com, will@...nel.org, lpieralisi@...nel.org,
	guohanjun@...wei.com, sudeep.holla@....com,
	xueshuai@...ux.alibaba.com, linux-kernel@...r.kernel.org,
	linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	rafael@...nel.org, lenb@...nel.org, tony.luck@...el.com,
	yazen.ghannam@....com, misono.tomohiro@...itsu.com,
	fengwei_yin@...ux.alibaba.com
Subject: Re: [PATCH v5 00/17] ARM Error Source Table V2 Support

On Mon, Jan 05, 2026 at 05:12:25PM +0800, Ruidong Tian wrote:
> > What is a "RAS node"?
> A RAS node is the hardware interface for error reporting and control,
> consisting of one or more register sets (a collection of RAS records). It is
> responsible for error logging and interrupt signaling[0].

OMG, one more meaning for the word "node". Because we're not ambiguous enough.

/facepalm

> A single hardware component can feature multiple RAS nodes. For example, a
> memory controller is treated as a "RAS device", where each memory channel
> has its own RAS node. Interrupts generated by these nodes are typically
> aggregated into a single interrupt line managed at the RAS device level.

Nomenclatural tragedy, I'd say.

> Comparison with x86 MCA:
> 
> RAS record ≈ MCA bank.
> RAS node ≈ A set of MCA banks + CMCI on a core.
> 
> The key difference lies in uncore handling: x86 typically maps uncore errors
> (like those from a memory controller) into core-based MCA banks. In
> contrast, ARM requires uncore components to provide their own standalone RAS
> nodes. When a component requires multiple such nodes, they are grouped and
> managed as a "RAS device" in the AEST driver.
> 
> [0]: https://developer.arm.com/documentation/ihi0100/latest

Yah, thanks for explaining.
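
For readers following along, the device/node/record hierarchy described in the quoted text could be modelled roughly as below. This is only a sketch: every type, field, and function name is hypothetical and is not taken from the AEST driver, and treating a non-zero status as "error logged" is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the AEST driver's structures. */

/* One RAS record: roughly comparable to one x86 MCA bank. */
struct ras_record {
	unsigned long status;	/* assumed: non-zero means an error is logged */
	unsigned long addr;
};

/* One RAS node: a set of records, e.g. one memory channel. */
struct ras_node {
	struct ras_record *records;
	size_t nr_records;
};

/* One RAS device: groups the nodes of one component, e.g. a memory controller. */
struct ras_device {
	struct ras_node *nodes;
	size_t nr_nodes;
	int irq;		/* the single aggregated interrupt line */
};

/* Count records with a logged error across all nodes of a device. */
static size_t ras_device_pending(const struct ras_device *dev)
{
	size_t pending = 0;

	assert(dev);
	for (size_t i = 0; i < dev->nr_nodes; i++) {
		const struct ras_node *node = &dev->nodes[i];

		for (size_t j = 0; j < node->nr_records; j++)
			if (node->records[j].status)
				pending++;
	}
	return pending;
}
```

Because the interrupt is aggregated at the device level, a handler in this model would have to walk every node's records, as ras_device_pending() does, to find which one fired.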

> > The ATL is very AMD-specific. What does "conceptually similar" mean exactly?
> By "conceptually similar," I mean that both ARM and AMD share the same
> functional requirement: translating between a System Physical Address (SPA)
> and a device-specific address (like a DRAM address) for RAS purposes.
> 
> The goal here is not to share the hardware-specific translation logic, but
> to provide a unified interface (an abstraction layer). The actual
> implementation of the translation remains entirely architecture-specific.

And why do we need an arch-overlapping unified interface?

You can just as well have aest_convert_la_to_spa() and none of that "unifying"
churn.
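
To make that concrete, a direct helper with no abstraction layer might look like the sketch below. Only the name aest_convert_la_to_spa() comes from this mail; the signature, the struct, the reading of "la" as a device-local address, and the linear base-plus-offset mapping are all assumptions for illustration. Real translation would be hardware-specific (interleaving and so on).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the address window owned by one node. */
struct aest_addr_window {
	uint64_t spa_base;	/* system physical address where the window starts */
	uint64_t size;		/* window size in bytes */
};

/*
 * Hypothetical signature. Translates a device-local address into a
 * system physical address using a placeholder linear mapping.
 * Returns 0 on success, -1 if the address is outside the window.
 */
static int aest_convert_la_to_spa(const struct aest_addr_window *win,
				  uint64_t la, uint64_t *spa)
{
	assert(win && spa);

	if (la >= win->size)
		return -1;

	*spa = win->spa_base + la;
	return 0;
}
```

A caller in the ARM code would invoke this directly, with the AMD ATL left untouched on its side.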

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette