Message-ID: <aS3y0j96t1ygwJsR@aschofie-mobl2.lan>
Date: Mon, 1 Dec 2025 11:56:02 -0800
From: Alison Schofield <alison.schofield@...el.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
CC: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<nvdimm@...ts.linux.dev>, <linux-fsdevel@...r.kernel.org>,
<linux-pm@...r.kernel.org>, Vishal Verma <vishal.l.verma@...el.com>, "Ira
Weiny" <ira.weiny@...el.com>, Dan Williams <dan.j.williams@...el.com>,
Jonathan Cameron <jonathan.cameron@...wei.com>, Yazen Ghannam
<yazen.ghannam@....com>, Dave Jiang <dave.jiang@...el.com>, Davidlohr Bueso
<dave@...olabs.net>, Matthew Wilcox <willy@...radead.org>, Jan Kara
<jack@...e.cz>, "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown
<len.brown@...el.com>, Pavel Machek <pavel@...nel.org>, Li Ming
<ming.li@...omail.com>, Jeff Johnson <jeff.johnson@....qualcomm.com>, "Ying
Huang" <huang.ying.caritas@...il.com>, Yao Xingtao <yaoxt.fnst@...itsu.com>,
Peter Zijlstra <peterz@...radead.org>, Greg KH <gregkh@...uxfoundation.org>,
Nathan Fontenot <nathan.fontenot@....com>, Terry Bowman
<terry.bowman@....com>, Robert Richter <rrichter@....com>, Benjamin Cheatham
<benjamin.cheatham@....com>, Zhijian Li <lizhijian@...itsu.com>, "Borislav
Petkov" <bp@...en8.de>, Ard Biesheuvel <ardb@...nel.org>
Subject: Re: [PATCH v4 0/9] dax/hmem, cxl: Coordinate Soft Reserved handling
with CXL and HMEM
On Thu, Nov 20, 2025 at 03:19:16AM +0000, Smita Koralahalli wrote:
> This series aims to address long-standing conflicts between HMEM and
> CXL when handling Soft Reserved memory ranges.
>
> Reworked from Dan's patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/patch/?id=ab70c6227ee6165a562c215d9dcb4a1c55620d5d
>
> Previous work:
> https://lore.kernel.org/all/20250715180407.47426-1-Smita.KoralahalliChannabasappa@amd.com/
>
> Link to v3:
> https://lore.kernel.org/all/20250930044757.214798-1-Smita.KoralahalliChannabasappa@amd.com
>
> This series should be applied on top of:
> "214291cbaace: acpi/hmat: Fix lockdep warning for hmem_register_resource()"
> and is based on:
> base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
>
> I initially tried picking up the three probe ordering patches from v20/v21
> of Type 2 support, but I hit a NULL pointer dereference in
> devm_cxl_add_memdev() and a cyclic dependency with all patches, so I left
> them out for now. With my current series rebased on 6.18-rc2 plus
> 214291cbaace, probe ordering behaves correctly on AMD systems and I have
> verified the scenarios mentioned below. I can pull those three patches
> back in for a future revision once the failures are sorted out.
Hi Smita,
This is a regression from the v3 version for my hotplug test case.
I believe it is at least partially due to the omitted probe order patches.
I'm not clear why that 'dax18.0' still exists after region teardown.
Upon booting:
- Do not expect to see that Soft Reserved resource
68e80000000-8d37fffffff : CXL Window 9
68e80000000-70e7fffffff : region9
68e80000000-70e7fffffff : Soft Reserved
68e80000000-70e7fffffff : dax18.0
68e80000000-70e7fffffff : System RAM (kmem)
After region teardown:
- Do not expect to see that Soft Reserved resource
- Do not expect to see that DAX or kmem
68e80000000-8d37fffffff : CXL Window 9
68e80000000-70e7fffffff : Soft Reserved
68e80000000-70e7fffffff : dax18.0
68e80000000-70e7fffffff : System RAM (kmem)
Create the region anew:
- Here we see a new region and dax devices created in the
available space after the Soft Reserved. We don't want
that. We want to be able to recreate in that original
space of 68e80000000-70e7fffffff.
68e80000000-8d37fffffff : CXL Window 9
68e80000000-70e7fffffff : Soft Reserved
68e80000000-70e7fffffff : dax18.0
68e80000000-70e7fffffff : System RAM (kmem)
70e80000000-78e7fffffff : region9
70e80000000-78e7fffffff : dax9.0
70e80000000-78e7fffffff : System RAM (kmem)
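The expectation for the teardown step can be written down as a small check against /proc/iomem text. This is only a minimal sketch, not part of any test suite: the helper names (parse_iomem, names_within, check_after_teardown) are made up for illustration, and it compares address ranges rather than indentation because the listings above are flattened.

```python
import re

# Matches "start-end : name" with optional leading indentation.
LINE_RE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+)\s*:\s*(.+?)\s*$")


def parse_iomem(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return entries


def names_within(entries, start, end):
    """Names of all resources that lie entirely inside [start, end]."""
    return [name for s, e, name in entries if s >= start and e <= end]


def check_after_teardown(text, start, end):
    """Return the resources that should be gone from [start, end] but are not."""
    unexpected = []
    for name in names_within(parse_iomem(text), start, end):
        # After region teardown: no Soft Reserved, no DAX device, no kmem.
        if name == "Soft Reserved" or name.startswith("dax") or "kmem" in name:
            unexpected.append(name)
    return unexpected


# The "After region teardown" listing from this report.
AFTER_TEARDOWN = """\
68e80000000-8d37fffffff : CXL Window 9
68e80000000-70e7fffffff : Soft Reserved
68e80000000-70e7fffffff : dax18.0
68e80000000-70e7fffffff : System RAM (kmem)
"""

leftovers = check_after_teardown(AFTER_TEARDOWN, 0x68E80000000, 0x70E7FFFFFFF)
print(leftovers)
```

An empty list would mean the original span is free again and a new region could be created at 68e80000000.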
-- Alison
>
> Probe order patches of interest:
> cxl/mem: refactor memdev allocation
> cxl/mem: Arrange for always-synchronous memdev attach
> cxl/port: Arrange for always synchronous endpoint attach
>
> [1] Hotplug looks okay. After offlining the memory I can tear down the
> regions and recreate them if CXL owns the entire SR range, since Soft
> Reserved is gone. dax_cxl creates dax devices and onlines memory.
> 850000000-284fffffff : CXL Window 0
> 850000000-284fffffff : region0
> 850000000-284fffffff : dax0.0
> 850000000-284fffffff : System RAM (kmem)
>
> [2] With CONFIG_CXL_REGION disabled, all the resources are handled by
> HMEM. Soft Reserved range shows up in /proc/iomem, no regions come up
> and dax devices are created from HMEM.
> 850000000-284fffffff : CXL Window 0
> 850000000-284fffffff : Soft Reserved
> 850000000-284fffffff : dax0.0
> 850000000-284fffffff : System RAM (kmem)
>
> [3] Region assembly failures also behave okay and work the same as [2].
>
> Before:
> 2850000000-484fffffff : Soft Reserved
> 2850000000-484fffffff : CXL Window 1
> 2850000000-484fffffff : dax4.0
> 2850000000-484fffffff : System RAM (kmem)
>
> After tearing down dax4.0 and creating it back:
>
> Logs:
> [ 547.847764] unregister_dax_mapping: mapping0: unregister_dax_mapping
> [ 547.855000] trim_dev_dax_range: dax dax4.0: delete range[0]: 0x2850000000:0x484fffffff
> [ 622.474580] alloc_dev_dax_range: dax dax4.1: alloc range[0]: 0x0000002850000000:0x000000484fffffff
> [ 752.766194] Fallback order for Node 0: 0 1
> [ 752.766199] Fallback order for Node 1: 1 0
> [ 752.766200] Built 2 zonelists, mobility grouping on. Total pages: 8096220
> [ 752.783234] Policy zone: Normal
> [ 752.808604] Demotion targets for Node 0: preferred: 1, fallback: 1
> [ 752.815509] Demotion targets for Node 1: null
>
> After:
> 2850000000-484fffffff : Soft Reserved
> 2850000000-484fffffff : CXL Window 1
> 2850000000-484fffffff : dax4.1
> 2850000000-484fffffff : System RAM (kmem)
>
> [4] A small hack to tear down the fully assembled and probed region
> (i.e., a region in the committed state) for range 850000000-284fffffff.
> This is to test the region teardown path for regions which don't
> fully cover the Soft Reserved range.
>
> 850000000-284fffffff : Soft Reserved
> 850000000-284fffffff : CXL Window 0
> 850000000-284fffffff : dax5.0
> 850000000-284fffffff : System RAM (kmem)
> 2850000000-484fffffff : CXL Window 1
> 2850000000-484fffffff : region1
> 2850000000-484fffffff : dax1.0
> 2850000000-484fffffff : System RAM (kmem)
> 4850000000-684fffffff : CXL Window 2
> 4850000000-684fffffff : region2
> 4850000000-684fffffff : dax2.0
> 4850000000-684fffffff : System RAM (kmem)
>
> daxctl list -R -u
> [
> {
> "path":"\/platform\/ACPI0017:00\/root0\/decoder0.1\/region1\/dax_region1",
> "id":1,
> "size":"128.00 GiB (137.44 GB)",
> "align":2097152
> },
> {
> "path":"\/platform\/hmem.5",
> "id":5,
> "size":"128.00 GiB (137.44 GB)",
> "align":2097152
> },
> {
> "path":"\/platform\/ACPI0017:00\/root0\/decoder0.2\/region2\/dax_region2",
> "id":2,
> "size":"128.00 GiB (137.44 GB)",
> "align":2097152
> }
> ]
>
> I couldn't test multiple regions under the same Soft Reserved range
> with/without contiguous mapping due to limited BIOS support. Hopefully
> that works.
>
> v4 updates:
> - No changes to patches 1-3.
> - New patches 4-7.
> - handle_deferred_cxl() has been enhanced to handle the case where CXL
> regions do not contiguously and fully cover Soft Reserved ranges.
> - Support added to defer cxl_dax registration.
> - Support added to tear down CXL regions.
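For readers following along, the "contiguously and fully cover" condition boils down to an interval-coverage test. The sketch below only illustrates that idea and is not the kernel's cxl_regions_fully_map() logic, which is not shown in this thread. The function name regions_fully_map and the split ranges in the usage lines are hypothetical; only the 850000000-284fffffff span comes from the cover letter.

```python
def regions_fully_map(sr_start, sr_end, regions):
    """True if the inclusive (start, end) ranges cover [sr_start, sr_end] with no gap."""
    next_needed = sr_start  # lowest address not yet known to be covered
    for start, end in sorted(regions):
        if start > next_needed:
            return False  # hole before this region: coverage is not contiguous
        next_needed = max(next_needed, end + 1)
        if next_needed > sr_end:
            return True  # everything up to sr_end is covered
    return next_needed > sr_end


SR = (0x850000000, 0x284FFFFFFF)

# One region spanning the whole Soft Reserved range, as in scenario [1].
print(regions_fully_map(*SR, [(0x850000000, 0x284FFFFFFF)]))

# Hypothetical split with a hole in the middle: coverage fails.
print(regions_fully_map(*SR, [(0x850000000, 0x184FFFFFFF),
                              (0x1950000000, 0x284FFFFFFF)]))
```

In the terms of this series, a False result corresponds to the case where HMEM reclaims the Soft Reserved range instead of CXL owning it.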
>
> v3 updates:
> - Fixed two "From".
>
> v2 updates:
> - Removed conditional check on CONFIG_EFI_SOFT_RESERVE as dax_hmem
> depends on CONFIG_EFI_SOFT_RESERVE. (Zhijian)
> - Added TODO note. (Zhijian)
> - Included region_intersects_soft_reserve() inside CONFIG_EFI_SOFT_RESERVE
> conditional check. (Zhijian)
> - insert_resource_late() -> insert_resource_expand_to_fit() and
> __insert_resource_expand_to_fit() replacement. (Boris)
> - Fixed Co-developed and Signed-off by. (Dan)
> - Combined 2/6 and 3/6 into a single patch. (Zhijian).
> - Skip local variable in remove_soft_reserved. (Jonathan)
> - Drop kfree with __free(). (Jonathan)
> - return 0 -> return dev_add_action_or_reset(host...) (Jonathan)
> - Dropped 6/6.
> - Reviewed-by tags (Dave, Jonathan)
>
> Dan Williams (4):
> dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is
> ready
> dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved
> ranges
> dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL
> dax/hmem: Defer handling of Soft Reserved ranges that overlap CXL
> windows
>
> Smita Koralahalli (5):
> cxl/region, dax/hmem: Arbitrate Soft Reserved ownership with
> cxl_regions_fully_map()
> cxl/region: Add register_dax flag to control probe-time devdax setup
> cxl/region, dax/hmem: Register devdax only when CXL owns Soft Reserved
> span
> cxl/region, dax/hmem: Tear down CXL regions when HMEM reclaims Soft
> Reserved
> dax/hmem: Reintroduce Soft Reserved ranges back into the iomem tree
>
> arch/x86/kernel/e820.c | 2 +-
> drivers/cxl/acpi.c | 2 +-
> drivers/cxl/core/region.c | 181 ++++++++++++++++++++++++++++++++++++--
> drivers/cxl/cxl.h | 17 ++++
> drivers/dax/Kconfig | 2 +
> drivers/dax/hmem/device.c | 4 +-
> drivers/dax/hmem/hmem.c | 137 ++++++++++++++++++++++++++---
> include/linux/ioport.h | 13 ++-
> kernel/resource.c | 92 ++++++++++++++++---
> 9 files changed, 415 insertions(+), 35 deletions(-)
>
> base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
> --
> 2.17.1
>