lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 9 Feb 2024 15:44:09 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, Robert Richter
	<rrichter@....com>, Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron
	<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, "Alison
 Schofield" <alison.schofield@...el.com>, Vishal Verma
	<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>
CC: Robert Richter <rrichter@....com>, Jonathan Cameron
	<Jonathan.Cameron@...wei.com>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] cxl/pci: Fix disabling CXL memory for zero-based
 addressing

Dan Williams wrote:
> Robert Richter wrote:
> > Based on CPU implementation and architecture, the CXL memory address
> > decode per memory channel can be implemented as zero based address for
> > addressing the CXL attached memory. In such case, the CXL host
> > physical address may not match the system address. The CFMWS contains
> > CXL ranges that are based on the system address range for the host
> > physical address and may not match with the CXL decoders.
> > 
> > During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> > 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> > an HPA window that is described in a CFMWS structure of the CXL host
> > bridge (cxl-3.1, 9.18.1.3).
> > 
> > Now, if the range registers are programmed with zero-based addresses,
> > the ranges do not match the CFMWS windows and the CXL memory range
> > will be disabled. The HDM decoder stops working then which causes
> > system memory being disabled and further a kernel hang during HDM
> > decoder initialization, typically when a CXL enabled kernel boots.
> > 
> > If the decoder is programmed with a zero-based hardware address and
> > the range is enabled, the CXL memory range is then in use by the
> > system.
> > 
> > Fix a kernel hang due to disabling of CXL memory during HDM decoder
> > initialization by adding a check for zero-based address ranges, mark
> > such ranges as used which prevents the CXL memory from being disabled.
> > 
> > Note this patch only fixes HDM initialization for zero-based address
> > ranges and a kernel hang this may cause. Decoder setup still does not
> > enable the HPA ranges for zero-based address ranges, the HDM decoder
> > cannot be added then and the kernel shows a message like the
> > following:
> > 
> >  cxl decoder1.0: failed to add to region: 0x0-0x3ffffffff
> > 
> > However, support for this can be added in a later series.
> > 
> > Fix for stable, please add stable tag.
> > 
> > Fixes: 9de321e93c3b ("cxl/pci: Refactor cxl_hdm_decode_init()")
> > Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")
> > Signed-off-by: Robert Richter <rrichter@....com>
> > ---
> >  drivers/cxl/core/pci.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 569354a5536f..3a36a2f0c94f 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -466,6 +466,18 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> >  	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
> >  		struct device *cxld_dev;
> >  
> > +		/*
> > +		 * Handle zero-based hardware addresses
> > +		 */
> > +		if (!info->dvsec_range[i].start &&
> > +		    info->dvsec_range[i].end != CXL_RESOURCE_NONE &&
> > +		    info->dvsec_range[i].end) {
> > +			dev_dbg(dev, "Zero-based hardware range found [%#llx - %#llx]\n",
> > +				info->dvsec_range[i].start, info->dvsec_range[i].end);
> > +			allowed++;
> > +			continue;
> > +		}
> > +
> 
> I am not comfortable with this. It should be checking a platform
> specific quirk, or similar for the possibility of HPA != SPA. The
> entirety of the Linux CXL subsystem is built on the assumption that HPA
> == SPA, and if a platform wants to inject an offset between those Linux
> needs some way to enumerate that it is running in that new world. Yes,
> nothing in the CXL specification precludes HPA != SPA, but Linux has
> long since shipped the opposite assumption.

In other words, this nothing to do with "zero-based addressing", this is
about a missing platform translation that the CXL subsystem needs to
apply to the values it reads from hardware to correctly map them into
the HPA == SPA expectations of the Linux implementation. Something like
this (untested, not even compiled) where cxl_acpi picks up a quirk from
somewhere to identify the platform and replace default_cxl_xlat_hpa()
with what is needed:

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index dcf2b39e1048..8543c9230484 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -312,8 +312,18 @@ static int cxl_acpi_qos_class(struct cxl_root *cxl_root,
 	return cxl_acpi_evaluate_qtg_dsm(handle, coord, entries, qos_class);
 }
 
+static resource_size_t default_xlat_hpa(struct pci_dev *pdev, resource_size_t dev_hpa)
+{
+	/*
+	 * default expectation is that device HPA values match host
+	 * physical address resource values
+	 */
+	return hpa;
+}
+
 static const struct cxl_root_ops acpi_root_ops = {
 	.qos_class = cxl_acpi_qos_class,
+	.xlat_hpa = default_cxl_xlat_hpa,
 };
 
 static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 6c9c8d92f8f7..a945a0eccd6a 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -322,7 +322,7 @@ static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
 	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
 }
 
-int cxl_dvsec_rr_decode(struct device *dev, int d,
+int cxl_dvsec_rr_decode(struct cxl_root *cxl_root, struct device *dev, int d,
 			struct cxl_endpoint_dvsec_info *info)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
@@ -411,6 +411,7 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
 
 		base |= temp & CXL_DVSEC_MEM_BASE_LOW_MASK;
 
+		base = cxl_root->ops->xlat_hpa(pdev, base);
 		info->dvsec_range[i] = (struct range) {
 			.start = base,
 			.end = base + size - 1
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 97c21566677a..83fe7018d243 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -91,6 +91,7 @@ static int cxl_switch_port_probe(struct cxl_port *port)
 
 static int cxl_endpoint_port_probe(struct cxl_port *port)
 {
+	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
 	struct cxl_endpoint_dvsec_info info = { .port = port };
 	struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
@@ -98,7 +99,7 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 	struct cxl_port *root;
 	int rc;
 
-	rc = cxl_dvsec_rr_decode(cxlds->dev, cxlds->cxl_dvsec, &info);
+	rc = cxl_dvsec_rr_decode(cxl_root, cxlds->dev, cxlds->cxl_dvsec, &info);
 	if (rc < 0)
 		return rc;
 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ