Message-ID: <aMfPe6HY3Xb7Yl6l@rric.localdomain>
Date: Mon, 15 Sep 2025 10:34:03 +0200
From: Robert Richter <rrichter@....com>
To: Dave Jiang <dave.jiang@...el.com>
Cc: Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Davidlohr Bueso <dave@...olabs.net>, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, Gregory Price <gourry@...rry.net>,
"Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>,
Terry Bowman <terry.bowman@....com>,
Joshua Hahn <joshua.hahnjy@...il.com>
Subject: Re: [PATCH v3 11/11] cxl: Enable AMD Zen5 address translation using
ACPI PRMT
On 12.09.25 16:46:23, Dave Jiang wrote:
>
>
> On 9/12/25 7:45 AM, Robert Richter wrote:
> > Add AMD Zen5 support for address translation.
> >
> > Zen5 systems may be configured to use 'Normalized addresses'. Then,
> > CXL endpoints use their own physical address space and are programmed
> > passthrough (DPA == HPA); the number of interleaving ways for the
> > endpoint is set to one. The Host Physical Addresses (HPAs) need to be
> > translated from the endpoint to its CXL host bridge. The HPA of a CXL
> > host bridge is equivalent to the System Physical Address (SPA).
> >
> > ACPI Platform Runtime Mechanism (PRM) is used to translate the CXL
> > Device Physical Address (DPA) to its System Physical Address. This is
> > documented in:
> >
> > AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> > ACPI v6.5 Porting Guide, Publication # 58088
> > https://www.amd.com/en/search/documentation/hub.html
> >
> > To implement AMD Zen5 address translation the following steps are
> > needed:
> >
> > AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
> > call (Address Translation - CXL DPA to System Physical Address, see
> > ACPI v6.5 Porting Guide above) when address translation is enabled.
> > The existence of the callback can be identified using a specific GUID
> > as documented. The initialization code checks firmware and kernel
> > support of ACPI PRM.
> >
> > Introduce a new file core/atl.c to handle ACPI PRM specific address
> > translation code. Naming is loosely related to the kernel's AMD
> > Address Translation Library (CONFIG_AMD_ATL) but implementation does
> > not depend on it, nor is it vendor specific. Use Kbuild and Kconfig
> > options respectively to enable the code depending on architecture and
> > platform options.
> >
> > Implement an ACPI PRM firmware call for CXL address translation in the
> > new function cxl_prm_to_hpa(). This includes sanity checks. Enable the
> > callback for applicable CXL host bridges using the new cxl_atl_init()
> > function.
> >
> > Signed-off-by: Robert Richter <rrichter@....com>
>
> I'm still trying to digest the series. Couple things below.
Thank you for the review, I appreciate it.
>
> > ---
> > drivers/cxl/Kconfig | 4 ++
> > drivers/cxl/core/Makefile | 1 +
> > drivers/cxl/core/atl.c | 138 ++++++++++++++++++++++++++++++++++++++
> > drivers/cxl/core/core.h | 1 +
> > drivers/cxl/core/port.c | 8 +++
> > 5 files changed, 152 insertions(+)
> > create mode 100644 drivers/cxl/core/atl.c
> >
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index 48b7314afdb8..31f9c96ef908 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -233,4 +233,8 @@ config CXL_MCE
> > def_bool y
> > depends on X86_MCE && MEMORY_FAILURE
> >
> > +config CXL_ATL
> > + def_bool y
> > + depends on ACPI_PRMT
> > +
> > endif
> > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > index 5ad8fef210b5..11fe272a6e29 100644
> > --- a/drivers/cxl/core/Makefile
> > +++ b/drivers/cxl/core/Makefile
> > @@ -20,3 +20,4 @@ cxl_core-$(CONFIG_CXL_REGION) += region.o
> > cxl_core-$(CONFIG_CXL_MCE) += mce.o
> > cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> > cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> > +cxl_core-$(CONFIG_CXL_ATL) += atl.o
> > diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
> > new file mode 100644
> > index 000000000000..5fc21eddaade
> > --- /dev/null
> > +++ b/drivers/cxl/core/atl.c
> > @@ -0,0 +1,138 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2025 Advanced Micro Devices, Inc.
> > + */
> > +
> > +#include <linux/prmt.h>
> > +#include <linux/pci.h>
> > +
> > +#include <cxlmem.h>
> > +#include "core.h"
> > +
> > +static bool check_prm_address_translation(struct cxl_port *port)
> > +{
> > + /* Applies to CXL host bridges only */
> > + return !is_cxl_root(port) && port->host_bridge &&
> > + is_cxl_root(to_cxl_port(port->dev.parent));
> > +}
> > +
> > +/*
> > + * PRM Address Translation - CXL DPA to System Physical Address
> > + *
> > + * Reference:
> > + *
> > + * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> > + * ACPI v6.5 Porting Guide, Publication # 58088
> > + */
> > +
> > +static const guid_t prm_cxl_dpa_spa_guid =
> > + GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
> > + 0x48, 0x0b, 0x94);
> > +
> > +struct prm_cxl_dpa_spa_data {
> > + u64 dpa;
> > + u8 reserved;
> > + u8 devfn;
> > + u8 bus;
> > + u8 segment;
> > + void *out;
> > +} __packed;
> > +
> > +static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
> > +{
> > + struct prm_cxl_dpa_spa_data data;
> > + u64 spa;
> > + int rc;
> > +
> > + data = (struct prm_cxl_dpa_spa_data) {
> > + .dpa = dpa,
> > + .devfn = pci_dev->devfn,
> > + .bus = pci_dev->bus->number,
> > + .segment = pci_domain_nr(pci_dev->bus),
> > + .out = &spa,
> > + };
> > +
> > + rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> > + if (rc) {
> > + pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
> > + return ULLONG_MAX;
> > + }
> > +
> > + pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
> > +
> > + return spa;
> > +}
> > +
> > +static u64 cxl_prm_to_hpa(struct cxl_decoder *cxld, u64 hpa)
> > +{
> > + struct cxl_memdev *cxlmd;
> > + struct pci_dev *pci_dev;
> > + struct cxl_port *port;
> > + struct cxl_endpoint_decoder *cxled;
> > +
> > + /* Only translate from endpoint to its parent port. */
> > + if (!is_endpoint_decoder(&cxld->dev))
> > + return hpa;
> > +
> > + cxled = to_cxl_endpoint_decoder(&cxld->dev);
> > +
> > + /*
> > + * Nothing to do if base is non-zero and Normalized Addressing
> > + * is disabled.
> > + */
>
> Not sure if this comment matches the code below
Non-zero is wrong; I meant that the range must be zero-based (that is,
DPA == HPA). If it is not zero-based, Normalized Addressing is disabled
and no translation is needed.
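
For illustration only, the intended gating can be sketched in plain
userspace C. This is a simplified model and not the patch code: the
struct and function names are hypothetical, and the two fields merely
stand in for cxld->hpa_range.start and cxled->dpa_res->start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the endpoint decoder state. */
struct ep_decoder_sketch {
	uint64_t hpa_start;	/* start of the decoder's HPA range */
	uint64_t dpa_start;	/* start of the decoder's DPA resource */
};

/*
 * Translation is only considered when the endpoint is programmed
 * passthrough, that is, the HPA base equals the DPA base (DPA == HPA).
 * Any other base means Normalized Addressing is not in use.
 */
static bool needs_translation_sketch(const struct ep_decoder_sketch *d)
{
	return d->hpa_start == d->dpa_start;
}
```

With both bases at 0 the sketch reports that translation is needed; with
an HPA base that differs from the DPA base the address is used as-is.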
> > + if (cxld->hpa_range.start != cxled->dpa_res->start)
> > + return hpa;
> > +
> > + /*
> > + * Endpoints are programmed passthrough in Normalized
> > + * Addressing mode.
> > + */
> Not sure if the comment here matches the conditional check.
Passthrough implies that interleaving is disabled, so ways must be 1.
I will update the comment.
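
As a rough userspace sketch of the sanity checks that precede the
firmware call (hypothetical names, not the patch code): the decoder must
have exactly one interleave way and the address must lie inside the
decoder's HPA range, otherwise ULLONG_MAX is returned as the error
value. In this sketch a valid address is simply returned unchanged,
whereas the patch goes on to call the PRM handler.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the decoder fields used here. */
struct decoder_sketch {
	uint64_t hpa_start;	/* inclusive start of the HPA range */
	uint64_t hpa_end;	/* inclusive end of the HPA range */
	int interleave_ways;
};

/* Returns hpa if all checks pass, ULLONG_MAX otherwise. */
static uint64_t check_passthrough_sketch(const struct decoder_sketch *d,
					 uint64_t hpa)
{
	/* Passthrough implies no interleaving, so ways must be 1. */
	if (d->interleave_ways != 1)
		return ULLONG_MAX;

	/* The address must fall inside the decoder's HPA range. */
	if (hpa < d->hpa_start || hpa > d->hpa_end)
		return ULLONG_MAX;

	return hpa;
}
```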
> > + if (cxld->interleave_ways != 1) {
> > + dev_dbg(&cxld->dev, "unexpected interleaving config: ways: %d granularity: %d\n",
> > + cxld->interleave_ways, cxld->interleave_granularity);
> > + return ULLONG_MAX;
> > + }
> > +
> > + if (hpa < cxld->hpa_range.start || hpa > cxld->hpa_range.end) {
> > + dev_dbg(&cxld->dev, "hpa addr %#llx out of range %#llx-%#llx\n",
>
> Suggest using %pr for range printing to avoid 0-day complaints on 32-bit compilers.
Ok.
>
> > + hpa, cxld->hpa_range.start, cxld->hpa_range.end);
> > + return ULLONG_MAX;
> > + }
> > +
> > + port = to_cxl_port(cxld->dev.parent);
> > + cxlmd = port ? to_cxl_memdev(port->uport_dev) : NULL;
> > + if (!port || !dev_is_pci(cxlmd->dev.parent)) {
> > + dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
> > + dev_name(cxld->dev.parent), cxld->hpa_range.start,
> > + cxld->hpa_range.end);
> > + return ULLONG_MAX;
> > + }
> > + pci_dev = to_pci_dev(cxlmd->dev.parent);
> > +
> > + return prm_cxl_dpa_spa(pci_dev, hpa);
> > +}
> > +
> > +static void cxl_prm_init(struct cxl_port *port)
> > +{
> > + u64 spa;
> > + struct prm_cxl_dpa_spa_data data = { .out = &spa, };
> > + int rc;
> > +
> > + if (!check_prm_address_translation(port))
> > + return;
> > +
> > + /* Check kernel (-EOPNOTSUPP) and firmware support (-ENODEV) */
> > + rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> > + if (rc == -EOPNOTSUPP || rc == -ENODEV)
> > + return;
> > +
> > + port->to_hpa = cxl_prm_to_hpa;
> > +
> > + dev_dbg(port->host_bridge, "PRM address translation enabled for %s.\n",
> > + dev_name(&port->dev));
> > +}
> > +
> > +void cxl_atl_init(struct cxl_port *port)
> > +{
> > + cxl_prm_init(port);
> > +}
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index eac8cc1bdaa0..624e438d052a 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -150,6 +150,7 @@ int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
> > int cxl_ras_init(void);
> > void cxl_ras_exit(void);
> > int cxl_gpf_port_setup(struct cxl_dport *dport);
> > +void cxl_atl_init(struct cxl_port *port);
> >
> > #ifdef CONFIG_CXL_FEATURES
> > struct cxl_feat_entry *
> > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > index 8f36ff413f5d..8007e002888e 100644
> > --- a/drivers/cxl/core/port.c
> > +++ b/drivers/cxl/core/port.c
> > @@ -831,6 +831,12 @@ static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
> > &cxl_einj_inject_fops);
> > }
> >
> > +static void setup_address_translation(struct cxl_port *port)
> > +{
> > + if (IS_ENABLED(CONFIG_CXL_ATL))
> > + cxl_atl_init(port);
> > +}
> > +
> > static int cxl_port_add(struct cxl_port *port,
> > resource_size_t component_reg_phys,
> > struct cxl_dport *parent_dport)
> > @@ -868,6 +874,8 @@ static int cxl_port_add(struct cxl_port *port,
> > return rc;
> > }
> >
> > + setup_address_translation(port);
>
> Given that the address translation callback is only needed for the
> host bridge, should this be called from acpi_probe() when the host
> bridge is being set up rather than going through every port add and
> checking if the port is a host bridge?
Will check if that is feasible.
Thanks,
-Robert
> DJ
>
> > +
> > rc = device_add(dev);
> > if (rc)
> > return rc;
>