Message-ID: <69701f6de978_1d6f1001e@dwillia2-mobl4.notmuch>
Date: Tue, 20 Jan 2026 16:35:57 -0800
From: <dan.j.williams@...el.com>
To: Yazen Ghannam <yazen.ghannam@....com>, Robert Richter <rrichter@....com>
CC: Peter Zijlstra <peterz@...radead.org>, Dan Williams
	<dan.j.williams@...el.com>, Dave Jiang <dave.jiang@...el.com>, Ard Biesheuvel
	<ardb@...nel.org>, Jonathan Cameron <jonathan.cameron@...wei.com>, "Alison
 Schofield" <alison.schofield@...el.com>, Vishal Verma
	<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Davidlohr Bueso
	<dave@...olabs.net>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, Gregory Price <gourry@...rry.net>, "Fabio M.
 De Francesco" <fabio.m.de.francesco@...ux.intel.com>, Terry Bowman
	<terry.bowman@....com>, Joshua Hahn <joshua.hahnjy@...il.com>, "Borislav
 Petkov" <bp@...en8.de>, "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	"John Allen" <john.allen@....com>
Subject: Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using
 ACPI PRMT

Yazen Ghannam wrote:
[..]
> Additionally, the same translation code can be used in multiple places
> (tools, FW, kernel, etc.). Most consumers treat the code like a library
> that they include. It's coded once and bugs can be fixed in one place.
> 
> However, with a native kernel driver, we have to re-write everything to
> match coding style, licensing, etc.
> 
> Also, new hardware may need changes to the code (sometimes major). So
> there's upstream work, backporting (more testing), and so on.
> 
> See the AMD Address Translation Library at drivers/ras/amd/atl/.

There is more nuance here.

There are indeed cases where many non-architectural details are in flux
from one product to the next. For example, the ADXL DSM exists so that
EDAC no longer needs to chase shifting and complicated memory topology
details.

CXL is a standard, yet the architecture at issue decided to inject
software-model-destroying artifacts into it, like CXL-endpoint-HPA to
CXL-Host-Bridge-SPA (Normalized Addressing) translation.

A Normalized Address looks like a static offset per host bridge, not a
method call round trip to a runtime firmware service.
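
As a rough illustration of the "static offset per host bridge" model,
a translation under that model could be as small as the C sketch below.
The struct hb_window type, its fields, and norm_to_spa() are
hypothetical names invented here, not kernel or firmware interfaces, and
the sketch assumes the normalized address is simply relative to the
start of the bridge's window. Whether the real Zen5 algorithm reduces to
this is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-host-bridge record; not an existing kernel structure. */
struct hb_window {
	uint64_t spa_base;	/* assumed start of the bridge's SPA window */
	uint64_t size;		/* assumed window length in bytes */
};

/*
 * Translate a normalized (bridge-relative) address to an SPA by adding
 * a static per-bridge offset. Returns 0 on success, -1 when the
 * normalized address falls outside the window.
 */
static int norm_to_spa(const struct hb_window *hb, uint64_t norm,
		       uint64_t *spa)
{
	assert(hb && spa);

	if (norm >= hb->size)
		return -1;

	*spa = hb->spa_base + norm;
	return 0;
}
```

A caller would fill one struct hb_window per host bridge at enumeration
time and never call into firmware afterwards.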

Note that there are other platforms that break basic HPA-to-SPA
assumptions, but those have been handled with native driver support via
XOR interleave, and non-CXL-Host-Bridge target updates to the
ACPI.CEDT.CFMWS table.
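
For readers unfamiliar with XOR interleave, the general idea can be
sketched in a few lines of C: each bit of the interleave target index is
the parity of the HPA bits selected by a per-bit mask. The function
names and the mask values used with it are hypothetical, chosen only for
illustration. This is not the kernel's implementation, and it ignores
details such as non-power-of-2 interleave ways.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if an odd number of bits are set in v, else 0. */
static unsigned int parity64(uint64_t v)
{
	v ^= v >> 32;
	v ^= v >> 16;
	v ^= v >> 8;
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return (unsigned int)(v & 1);
}

/*
 * Derive an interleave target index from an HPA. Bit i of the index is
 * the XOR (parity) of the HPA bits selected by xormaps[i].
 */
static unsigned int xor_target_index(uint64_t hpa, const uint64_t *xormaps,
				     unsigned int nr_maps)
{
	unsigned int idx = 0;
	unsigned int i;

	assert(xormaps);

	for (i = 0; i < nr_maps; i++)
		idx |= parity64(hpa & xormaps[i]) << i;

	return idx;
}
```

The point of the sketch is that the whole computation is static data
plus a handful of bit operations, which is why it fits in a native
driver.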

> > > Worse, you might have to deal with various incompatible buggy PRM
> > > versions because BIOS :/
> > 
> > The address translation functions are straightforward. I haven't
> > experienced any issues here. If there were any, this would be
> > solvable, e.g. by requiring a specific minimum version or uuid to run
> > PRM.
> > 
> 
> This is a good point, and I've brought this up with some of my
> colleagues.

The more that software bugs leak into this interface requiring
consideration of versions and the like, the louder the requests for
"please move this to a driver" will become.

> The PRM methods are supposed to be able to be updated at runtime by the
> OS. We could think of this as a similar flow to microcode.

No, at the point where runtime updates are needed outside of a BIOS
update we have crossed the threshold into Linux actively taking on new
maintenance burden to enable hardware platforms to avoid the discipline
of architectural solutions.

Microcode is a confined solution space. PRM is unbounded.

Now, stepping back, this specific Zen5 support has been a long time
coming. Specifically, there are shipping platforms where Linux is unable
to use any of its CXL RAS support because it gets tripped up on this
fundamental step. I would like to see exact details on what this PRM
handler is doing so that we, the linux-cxl community, can make a
determination about:

    "yes this algorithm is so tiny and static, PRM not indicated"

    "no, this is complicated and guaranteed to keep shifting product to
     product, Linux is better off with a PRM helper"

...but still merge this PRM call, regardless of the determination. Put
the next potential use of PRM on notice that native drivers are required
outside of meeting the "complicated + shifting" criteria that indicate
PRM.
