Message-ID: <20251230202211.GAaVQ0cx8o-CqzGU2O@fat_crate.local>
Date: Tue, 30 Dec 2025 21:22:11 +0100
From: Borislav Petkov <bp@...en8.de>
To: Ruidong Tian <tianruidong@...ux.alibaba.com>
Cc: catalin.marinas@....com, will@...nel.org, lpieralisi@...nel.org,
	guohanjun@...wei.com, sudeep.holla@....com,
	xueshuai@...ux.alibaba.com, linux-kernel@...r.kernel.org,
	linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	rafael@...nel.org, lenb@...nel.org, tony.luck@...el.com,
	yazen.ghannam@....com, misono.tomohiro@...itsu.com,
	fengwei_yin@...ux.alibaba.com
Subject: Re: [PATCH v5 00/17] ARM Error Source Table V2 Support

Some high-level notes first:

On Tue, Dec 30, 2025 at 05:09:28PM +0800, Ruidong Tian wrote:
> This series introduces support for the ARM Error Source Table (AEST), aligning
> with version 2.0 of ACPI for Armv8 RAS Extensions [0].

I'd like to hear from ARM folks what the strategy for this thing is...

> AEST provides a critical mechanism for hardware to directly notify the
> operating system kernel about RAS errors via interrupts, a concept known as
> Kernel-first error handling. Compared to firmware-first error handling
> (e.g., GHES), AEST offers a more lightweight approach. This efficiency allows
> the OS to potentially report every Corrected Error (CE), enabling upper-layer
> applications to leverage CE information for error prediction[1][2].
> 
> This series is based on Tyler Baicar's preliminary patches [3], which have not
> yet been sent to the mailing list as v2.

I guess I'll wait for those first.

> AEST Driver Architecture
> ========================
> 
> The AEST driver is structured into three primary components:
>   - AEST device: Responsible for handling interrupts, managing the lifecycle
>                  of AEST nodes, and processing error records.
>   - AEST node: Corresponds directly to a RAS node in the hardware

What is a "RAS node"?

>   - AEST record: Represents a set of RAS registers associated with a specific
>                  error source.

...

> Address Translation
> ===================
> 
> As described in section 2.2 [0], error addresses reported by AEST records
> may be "node-specific Logical Addresses" rather than the "System Physical
> Addresses" (SPA) used by the kernel. Therefore, the driver needs to translate
> these Logical Addresses (LA) to SPA. This translation mechanism is conceptually
> similar to AMD's Address Translation Logic (ATL) [4], leading patch 0014 to
> introduce a common translation function for both AMD and ARM architectures.

Say what now?

The ATL is very AMD-specific. What does "conceptually similar" mean exactly?
What happens if we have to change the ATL and break your use case in the
process?

What exact functionality from the ATL do you really need here?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
