Message-ID: <aac45d58-afca-487c-8d14-62d5e7fd490e@fujitsu.com>
Date: Fri, 13 Jun 2025 02:12:37 +0000
From: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
	"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
CC: Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron
	<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, Alison
 Schofield <alison.schofield@...el.com>, Vishal Verma
	<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan Williams
	<dan.j.williams@...el.com>, Matthew Wilcox <willy@...radead.org>, Jan Kara
	<jack@...e.cz>, "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown
	<len.brown@...el.com>, Pavel Machek <pavel@...nel.org>, Li Ming
	<ming.li@...omail.com>, Jeff Johnson <jeff.johnson@....qualcomm.com>, Ying
 Huang <huang.ying.caritas@...il.com>, "Xingtao Yao (Fujitsu)"
	<yaoxt.fnst@...itsu.com>, Peter Zijlstra <peterz@...radead.org>, Greg KH
	<gregkh@...uxfoundation.org>, Nathan Fontenot <nathan.fontenot@....com>,
	Terry Bowman <terry.bowman@....com>, Robert Richter <rrichter@....com>,
	Benjamin Cheatham <benjamin.cheatham@....com>, PradeepVineshReddy Kodamati
	<PradeepVineshReddy.Kodamati@....com>
Subject: Re: [PATCH v4 7/7] cxl/dax: Defer DAX consumption of SOFT RESERVED
 resources until after CXL region creation

Hi Smita, Nathan, Terry

I am struggling to understand whether this patch is truly necessary, or whether
I have not fully grasped the scenario where it provides value. Without this
patch applied, on a QEMU/VM with both HMEM and CXL.mem installed, I observed no
issues. (Are specific config options required to reproduce the problem?)

Here is the /proc/iomem without the patch:
180000000-1ffffffff : Soft Reserved  ### 2 hmem nodes
   180000000-1bfffffff : dax1.0
     180000000-1bfffffff : System RAM (kmem)
   1c0000000-1ffffffff : dax2.0
     1c0000000-1ffffffff : System RAM (kmem)
5c0001128-5c00011b7 : port1
5d0000000-64fffffff : CXL Window 0  ### 1 CXL node
   5d0000000-64fffffff : region0
     5d0000000-64fffffff : dax0.0
       5d0000000-64fffffff : System RAM (kmem)

On 04/06/2025 06:19, Smita Koralahalli wrote:
> From: Nathan Fontenot <nathan.fontenot@....com>
> 
> The DAX HMEM driver currently consumes all SOFT RESERVED iomem resources
> during initialization. This interferes with the CXL driver’s ability to
> create regions and trim overlapping SOFT RESERVED ranges before DAX uses
> them.

Does "HMEM driver" in the commit message refer to
`dax_hmem_platform_driver` or `dax_hmem_driver`? Either way,
what is the impact if it consumes all SOFT RESERVED resources?

Since `hmem_register_device()` only creates HMEM devices for ranges
*without* `IORES_DESC_CXL` (a descriptor that cxl_acpi may set), cxl_core/cxl_dax
should still be able to create regions and DAX devices without conflicts.
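
To make the filtering argument concrete, here is a minimal userspace sketch (not
kernel code) of the rule as I understand it: a range is handed to HMEM only if
it does not intersect any range described as CXL. The names `enum desc`,
`struct range`, `hmem_would_register()` and `example_map` are hypothetical
stand-ins invented for this illustration. The addresses are taken from the
/proc/iomem output above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's resource descriptors. */
enum desc { DESC_NONE, DESC_SOFT_RESERVED, DESC_CXL };

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive, as printed in /proc/iomem */
	enum desc desc;
};

/* True if [a_start, a_end] and [b_start, b_end] share at least one byte. */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
			   uint64_t b_start, uint64_t b_end)
{
	return a_start <= b_end && b_start <= a_end;
}

/*
 * Model of the filter: a range is registered as an HMEM device only if
 * it does not intersect any CXL-described range in the map.
 */
static bool hmem_would_register(const struct range *res,
				const struct range *map, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (map[i].desc == DESC_CXL &&
		    ranges_overlap(res->start, res->end,
				   map[i].start, map[i].end))
			return false;
	}
	return true;
}

/* Simplified view of the iomem layout shown above. */
static const struct range example_map[] = {
	{ 0x180000000ULL, 0x1ffffffffULL, DESC_SOFT_RESERVED },
	{ 0x5d0000000ULL, 0x64fffffffULL, DESC_CXL },
};
```

Under this model, 0x180000000-0x1bfffffff is registered by HMEM, while any
range overlapping 0x5d0000000-0x64fffffff is left for CXL.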

> To resolve this, defer the DAX driver's resource consumption if the
> cxl_acpi driver is enabled. The DAX HMEM initialization skips walking the
> iomem resource tree in this case. After CXL region creation completes,
> any remaining SOFT RESERVED resources are explicitly registered with the
> DAX driver by the CXL driver.

Conversely, with this patch applied, `cxl_region_softreserv_update()` attempts
to register new HMEM devices. This may cause duplicate registrations for the
same range (e.g., 0x180000000-0x1ffffffff), triggering warnings like:

[   14.984108] kmem dax4.0: mapping0: 0x180000000-0x1ffffffff could not reserve region
[   14.987204] kmem dax4.0: probe with driver kmem failed with error -16

This happens because the HMAT initialization has already registered these sub-ranges:
   180000000-1bfffffff
   1c0000000-1ffffffff
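
The error -16 in the log is -EBUSY. As a rough userspace sketch (not kernel
code) of why the second registration fails, assume each range can be claimed
exclusively only once, and any overlapping claim is refused. `struct
claim_table` and `claim_range()` are hypothetical names for this illustration
only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAIMS 8

struct claim {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

struct claim_table {
	struct claim claims[MAX_CLAIMS];
	size_t count;
};

/* Claim [start, end] exclusively; refuse with -EBUSY on any overlap. */
static int claim_range(struct claim_table *t, uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < t->count; i++) {
		if (start <= t->claims[i].end && t->claims[i].start <= end)
			return -EBUSY;
	}

	if (t->count == MAX_CLAIMS)
		return -ENOMEM;

	t->claims[t->count].start = start;
	t->claims[t->count].end = end;
	t->count++;
	return 0;
}
```

Claiming 180000000-1bfffffff and 1c0000000-1ffffffff first, then the whole
180000000-1ffffffff range, makes the third call return -EBUSY in this model,
which matches the shape of the failure reported for dax4.0.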


If I'm missing something, please correct me.

Thanks,
Zhijian



> 
> This sequencing ensures proper handling of overlaps and fixes hotplug
> failures.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@....com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@....com>
> Co-developed-by: Terry Bowman <terry.bowman@....com>
> Signed-off-by: Terry Bowman <terry.bowman@....com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
> ---
>   drivers/cxl/core/region.c | 10 +++++++++
>   drivers/dax/hmem/device.c | 43 ++++++++++++++++++++-------------------
>   drivers/dax/hmem/hmem.c   |  3 ++-
>   include/linux/dax.h       |  6 ++++++
>   4 files changed, 40 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3a5ca44d65f3..c6c0c7ba3b20 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -10,6 +10,7 @@
>   #include <linux/sort.h>
>   #include <linux/idr.h>
>   #include <linux/memory-tiers.h>
> +#include <linux/dax.h>
>   #include <cxlmem.h>
>   #include <cxl.h>
>   #include "core.h"
> @@ -3553,6 +3554,11 @@ static struct resource *normalize_resource(struct resource *res)
>   	return NULL;
>   }
>   
> +static int cxl_softreserv_mem_register(struct resource *res, void *unused)
> +{
> +	return hmem_register_device(phys_to_target_node(res->start), res);
> +}
> +
>   static int __cxl_region_softreserv_update(struct resource *soft,
>   					  void *_cxlr)
>   {
> @@ -3590,6 +3596,10 @@ int cxl_region_softreserv_update(void)
>   				    __cxl_region_softreserv_update);
>   	}
>   
> +	/* Now register any remaining SOFT RESERVES with DAX */
> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
> +			    0, -1, NULL, cxl_softreserv_mem_register);
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_region_softreserv_update, "CXL");
> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
> index 59ad44761191..cc1ed7bbdb1a 100644
> --- a/drivers/dax/hmem/device.c
> +++ b/drivers/dax/hmem/device.c
> @@ -8,7 +8,6 @@
>   static bool nohmem;
>   module_param_named(disable, nohmem, bool, 0444);
>   
> -static bool platform_initialized;
>   static DEFINE_MUTEX(hmem_resource_lock);
>   static struct resource hmem_active = {
>   	.name = "HMEM devices",
> @@ -35,9 +34,7 @@ EXPORT_SYMBOL_GPL(walk_hmem_resources);
>   
>   static void __hmem_register_resource(int target_nid, struct resource *res)
>   {
> -	struct platform_device *pdev;
>   	struct resource *new;
> -	int rc;
>   
>   	new = __request_region(&hmem_active, res->start, resource_size(res), "",
>   			       0);
> @@ -47,21 +44,6 @@ static void __hmem_register_resource(int target_nid, struct resource *res)
>   	}
>   
>   	new->desc = target_nid;
> -
> -	if (platform_initialized)
> -		return;
> -
> -	pdev = platform_device_alloc("hmem_platform", 0);
> -	if (!pdev) {
> -		pr_err_once("failed to register device-dax hmem_platform device\n");
> -		return;
> -	}
> -
> -	rc = platform_device_add(pdev);
> -	if (rc)
> -		platform_device_put(pdev);
> -	else
> -		platform_initialized = true;
>   }
>   
>   void hmem_register_resource(int target_nid, struct resource *res)
> @@ -83,9 +65,28 @@ static __init int hmem_register_one(struct resource *res, void *data)
>   
>   static __init int hmem_init(void)
>   {
> -	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> -			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
> -	return 0;
> +	struct platform_device *pdev;
> +	int rc;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_ACPI)) {
> +		walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
> +				    IORESOURCE_MEM, 0, -1, NULL,
> +				    hmem_register_one);
> +	}
> +
> +	pdev = platform_device_alloc("hmem_platform", 0);
> +	if (!pdev) {
> +		pr_err("failed to register device-dax hmem_platform device\n");
> +		return -1;
> +	}
> +
> +	rc = platform_device_add(pdev);
> +	if (rc) {
> +		pr_err("failed to add device-dax hmem_platform device\n");
> +		platform_device_put(pdev);
> +	}
> +
> +	return rc;
>   }
>   
>   /*
> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
> index 3aedef5f1be1..a206b9b383e4 100644
> --- a/drivers/dax/hmem/hmem.c
> +++ b/drivers/dax/hmem/hmem.c
> @@ -61,7 +61,7 @@ static void release_hmem(void *pdev)
>   	platform_device_unregister(pdev);
>   }
>   
> -static int hmem_register_device(int target_nid, const struct resource *res)
> +int hmem_register_device(int target_nid, const struct resource *res)
>   {
>   	struct device *host = &dax_hmem_pdev->dev;
>   	struct platform_device *pdev;
> @@ -124,6 +124,7 @@ static int hmem_register_device(int target_nid, const struct resource *res)
>   	platform_device_put(pdev);
>   	return rc;
>   }
> +EXPORT_SYMBOL_GPL(hmem_register_device);
>   
>   static int dax_hmem_platform_probe(struct platform_device *pdev)
>   {
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index a4ad3708ea35..5052dca8b3bc 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -299,10 +299,16 @@ static inline int dax_mem2blk_err(int err)
>   
>   #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
>   void hmem_register_resource(int target_nid, struct resource *r);
> +int hmem_register_device(int target_nid, const struct resource *res);
>   #else
>   static inline void hmem_register_resource(int target_nid, struct resource *r)
>   {
>   }
> +
> +static inline int hmem_register_device(int target_nid, const struct resource *res)
> +{
> +	return 0;
> +}
>   #endif
>   
>   typedef int (*walk_hmem_fn)(int target_nid, const struct resource *res);
