lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65c68969903b1_afa429460@dwillia2-xfh.jf.intel.com.notmuch>
Date: Fri, 9 Feb 2024 12:22:01 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Robert Richter <rrichter@....com>, Davidlohr Bueso <dave@...olabs.net>,
	Jonathan Cameron <jonathan.cameron@...wei.com>, Dave Jiang
	<dave.jiang@...el.com>, Alison Schofield <alison.schofield@...el.com>, Vishal
 Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan
 Williams <dan.j.williams@...el.com>
CC: Robert Richter <rrichter@....com>, Jonathan Cameron
	<Jonathan.Cameron@...wei.com>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] cxl/pci: Fix disabling CXL memory for zero-based
 addressing

Robert Richter wrote:
> Based on CPU implementation and architecture, the CXL memory address
> decode per memory channel can be implemented as zero based address for
> addressing the CXL attached memory. In such case, the CXL host
> physical address may not match the system address. The CFMWS contains
> CXL ranges that are based on the system address range for the host
> physical address and may not match with the CXL decoders.
> 
> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> an HPA window that is described in a CFMWS structure of the CXL host
> bridge (cxl-3.1, 9.18.1.3).
> 
> Now, if the range registers are programmed with zero-based addresses,
> the ranges do not match the CFMWS windows and the CXL memory range
> will be disabled. The HDM decoder stops working then which causes
> system memory being disabled and further a kernel hang during HDM
> decoder initialization, typically when a CXL enabled kernel boots.
> 
> If the decoder is programmed with a zero-based hardware address and
> the range is enabled, the CXL memory range is then in use by the
> system.
> 
> Fix a kernel hang due to disabling of CXL memory during HDM decoder
> initialization by adding a check for zero-based address ranges, mark
> such ranges as used which prevents the CXL memory from being disabled.
> 
> Note this patch only fixes HDM initialization for zero-based address
> ranges and a kernel hang this may cause. Decoder setup still does not
> enable the HPA ranges for zero-based address ranges, the HDM decoder
> cannot be added then and the kernel shows a message like the
> following:
> 
>  cxl decoder1.0: failed to add to region: 0x0-0x3ffffffff
> 
> However, support for this can be added in a later series.
> 
> Fix for stable, please add stable tag.
> 
> Fixes: 9de321e93c3b ("cxl/pci: Refactor cxl_hdm_decode_init()")
> Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")
> Signed-off-by: Robert Richter <rrichter@....com>
> ---
>  drivers/cxl/core/pci.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 569354a5536f..3a36a2f0c94f 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -466,6 +466,18 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>  	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
>  		struct device *cxld_dev;
>  
> +		/*
> +		 * Handle zero-based hardware addresses
> +		 */
> +		if (!info->dvsec_range[i].start &&
> +		    info->dvsec_range[i].end != CXL_RESOURCE_NONE &&
> +		    info->dvsec_range[i].end) {
> +			dev_dbg(dev, "Zero-based hardware range found [%#llx - %#llx]\n",
> +				info->dvsec_range[i].start, info->dvsec_range[i].end);
> +			allowed++;
> +			continue;
> +		}
> +

I am not comfortable with this. It should be checking a platform
specific quirk, or similar for the possibility of HPA != SPA. The
entirety of the Linux CXL subsystem is built on the assumption that HPA
== SPA, and if a platform wants to inject an offset between those Linux
needs some way to enumerate that it is running in that new world. Yes,
nothing in the CXL specification precludes HPA != SPA, but Linux has
long since shipped the opposite assumption.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ