Message-ID: <20251120031925.87762-1-Smita.KoralahalliChannabasappa@amd.com>
Date: Thu, 20 Nov 2025 03:19:16 +0000
From: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
To: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<nvdimm@...ts.linux.dev>, <linux-fsdevel@...r.kernel.org>,
<linux-pm@...r.kernel.org>
CC: Alison Schofield <alison.schofield@...el.com>, Vishal Verma
<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan Williams
<dan.j.williams@...el.com>, Jonathan Cameron <jonathan.cameron@...wei.com>,
Yazen Ghannam <yazen.ghannam@....com>, Dave Jiang <dave.jiang@...el.com>,
Davidlohr Bueso <dave@...olabs.net>, Matthew Wilcox <willy@...radead.org>,
Jan Kara <jack@...e.cz>, "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown
<len.brown@...el.com>, Pavel Machek <pavel@...nel.org>, Li Ming
<ming.li@...omail.com>, Jeff Johnson <jeff.johnson@....qualcomm.com>, "Ying
Huang" <huang.ying.caritas@...il.com>, Yao Xingtao <yaoxt.fnst@...itsu.com>,
Peter Zijlstra <peterz@...radead.org>, Greg KH <gregkh@...uxfoundation.org>,
Nathan Fontenot <nathan.fontenot@....com>, Terry Bowman
<terry.bowman@....com>, Robert Richter <rrichter@....com>, Benjamin Cheatham
<benjamin.cheatham@....com>, Zhijian Li <lizhijian@...itsu.com>, "Borislav
Petkov" <bp@...en8.de>, Ard Biesheuvel <ardb@...nel.org>
Subject: [PATCH v4 0/9] dax/hmem, cxl: Coordinate Soft Reserved handling with CXL and HMEM
This series aims to address long-standing conflicts between HMEM and
CXL when handling Soft Reserved memory ranges.
Reworked from Dan's patch:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/patch/?id=ab70c6227ee6165a562c215d9dcb4a1c55620d5d
Previous work:
https://lore.kernel.org/all/20250715180407.47426-1-Smita.KoralahalliChannabasappa@amd.com/
Link to v3:
https://lore.kernel.org/all/20250930044757.214798-1-Smita.KoralahalliChannabasappa@amd.com
This series should be applied on top of:
"214291cbaace: acpi/hmat: Fix lockdep warning for hmem_register_resource()"
and is based on:
base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
I initially tried picking up the three probe ordering patches from v20/v21
of Type 2 support, but I hit a NULL pointer dereference in
devm_cxl_add_memdev() and a dependency cycle with all of the patches
applied, so I left them out for now. With my current series rebased on
6.18-rc2 plus 214291cbaace, probe ordering behaves correctly on AMD systems
and I have verified the scenarios described below. I can pull those three
patches back in for a future revision once the failures are sorted out.
Probe order patches of interest:
cxl/mem: refactor memdev allocation
cxl/mem: Arrange for always-synchronous memdev attach
cxl/port: Arrange for always synchronous endpoint attach
[1] Hotplug looks okay. After offlining the memory, I can tear down the
regions and recreate them if CXL owns the entire Soft Reserved (SR) range,
since the Soft Reserved entry is gone. dax_cxl creates the dax devices and
onlines the memory.
850000000-284fffffff : CXL Window 0
850000000-284fffffff : region0
850000000-284fffffff : dax0.0
850000000-284fffffff : System RAM (kmem)
[2] With CONFIG_CXL_REGION disabled, all the resources are handled by
HMEM. The Soft Reserved range shows up in /proc/iomem, no regions come up,
and dax devices are created from HMEM.
850000000-284fffffff : CXL Window 0
850000000-284fffffff : Soft Reserved
850000000-284fffffff : dax0.0
850000000-284fffffff : System RAM (kmem)
[3] Region assembly failures also behave okay and work the same as [2].
Before:
2850000000-484fffffff : Soft Reserved
2850000000-484fffffff : CXL Window 1
2850000000-484fffffff : dax4.0
2850000000-484fffffff : System RAM (kmem)
After tearing down dax4.0 and recreating it:
Logs:
[ 547.847764] unregister_dax_mapping: mapping0: unregister_dax_mapping
[ 547.855000] trim_dev_dax_range: dax dax4.0: delete range[0]: 0x2850000000:0x484fffffff
[ 622.474580] alloc_dev_dax_range: dax dax4.1: alloc range[0]: 0x0000002850000000:0x000000484fffffff
[ 752.766194] Fallback order for Node 0: 0 1
[ 752.766199] Fallback order for Node 1: 1 0
[ 752.766200] Built 2 zonelists, mobility grouping on. Total pages: 8096220
[ 752.783234] Policy zone: Normal
[ 752.808604] Demotion targets for Node 0: preferred: 1, fallback: 1
[ 752.815509] Demotion targets for Node 1: null
After:
2850000000-484fffffff : Soft Reserved
2850000000-484fffffff : CXL Window 1
2850000000-484fffffff : dax4.1
2850000000-484fffffff : System RAM (kmem)
[4] A small hack to tear down the fully assembled and probed region
(i.e. a region in the committed state) for range 850000000-284fffffff.
This tests the region teardown path for regions which don't fully cover
the Soft Reserved range.
850000000-284fffffff : Soft Reserved
850000000-284fffffff : CXL Window 0
850000000-284fffffff : dax5.0
850000000-284fffffff : System RAM (kmem)
2850000000-484fffffff : CXL Window 1
2850000000-484fffffff : region1
2850000000-484fffffff : dax1.0
2850000000-484fffffff : System RAM (kmem)
4850000000-684fffffff : CXL Window 2
4850000000-684fffffff : region2
4850000000-684fffffff : dax2.0
4850000000-684fffffff : System RAM (kmem)
daxctl list -R -u
[
{
"path":"\/platform\/ACPI0017:00\/root0\/decoder0.1\/region1\/dax_region1",
"id":1,
"size":"128.00 GiB (137.44 GB)",
"align":2097152
},
{
"path":"\/platform\/hmem.5",
"id":5,
"size":"128.00 GiB (137.44 GB)",
"align":2097152
},
{
"path":"\/platform\/ACPI0017:00\/root0\/decoder0.2\/region2\/dax_region2",
"id":2,
"size":"128.00 GiB (137.44 GB)",
"align":2097152
}
]
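As a quick cross-check of the listings above, the inclusive ranges printed in /proc/iomem can be converted to sizes and compared against the 128.00 GiB that daxctl reports. The sketch below is only illustrative arithmetic; the helper names iomem_span_size() and bytes_to_gib() are made up for this example and are not part of the series.

```c
#include <stdint.h>

/*
 * Size in bytes of an inclusive [start, end] range, as printed in
 * /proc/iomem. Hypothetical helper, for illustration only.
 */
static uint64_t iomem_span_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* Convert a byte count to whole GiB (2^30 bytes). */
static uint64_t bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}
```

For example, bytes_to_gib(iomem_span_size(0x850000000, 0x284fffffff)) evaluates to 128, and the same holds for CXL Window 1 and CXL Window 2.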
I couldn't test multiple regions under the same Soft Reserved range,
with or without contiguous mapping, because of limited BIOS support.
Hopefully that works.
v4 updates:
- No changes to patches 1-3.
- New patches 4-7.
- handle_deferred_cxl() has been enhanced to handle the case where CXL
regions do not contiguously and fully cover Soft Reserved ranges.
- Support added to defer cxl_dax registration.
- Support added to tear down CXL regions.
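To illustrate what "contiguously and fully cover" means for the items above, here is a minimal userspace sketch of a coverage check. It is not the code from this series: struct span and spans_fully_cover() are hypothetical names, the regions are assumed to be sorted by start address, and the real cxl_regions_fully_map() and handle_deferred_cxl() operate on kernel resources and may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range; not a kernel type. */
struct span {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true when the regions (assumed sorted by start address)
 * cover every address in [sr->start, sr->end] without a gap.
 */
static bool spans_fully_cover(const struct span *sr,
			      const struct span *regions, size_t n)
{
	uint64_t next = sr->start;	/* lowest address not yet covered */
	size_t i;

	for (i = 0; i < n; i++) {
		if (regions[i].end < next)
			continue;	/* lies entirely below the uncovered part */
		if (regions[i].start > next)
			return false;	/* gap before this region */
		if (regions[i].end >= sr->end)
			return true;	/* reached the end of the range */
		next = regions[i].end + 1;
	}

	return false;	/* ran out of regions before the end of the range */
}
```

With the Soft Reserved range 850000000-284fffffff, a single region spanning the whole range returns true, two back-to-back regions that meet at 1850000000 also return true, and two regions that leave a hole between them return false. In this simplified model, the false case corresponds to the range being handed back to HMEM.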
v3 updates:
- Fixed two "From".
v2 updates:
- Removed conditional check on CONFIG_EFI_SOFT_RESERVE as dax_hmem
depends on CONFIG_EFI_SOFT_RESERVE. (Zhijian)
- Added TODO note. (Zhijian)
- Included region_intersects_soft_reserve() inside CONFIG_EFI_SOFT_RESERVE
conditional check. (Zhijian)
- insert_resource_late() -> insert_resource_expand_to_fit() and
__insert_resource_expand_to_fit() replacement. (Boris)
- Fixed Co-developed-by and Signed-off-by tags. (Dan)
- Combined 2/6 and 3/6 into a single patch. (Zhijian)
- Skip local variable in remove_soft_reserved. (Jonathan)
- Drop kfree with __free(). (Jonathan)
- return 0 -> return dev_add_action_or_reset(host...) (Jonathan)
- Dropped 6/6.
- Reviewed-by tags (Dave, Jonathan)
Dan Williams (4):
dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is
ready
dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved
ranges
dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL
dax/hmem: Defer handling of Soft Reserved ranges that overlap CXL
windows
Smita Koralahalli (5):
cxl/region, dax/hmem: Arbitrate Soft Reserved ownership with
cxl_regions_fully_map()
cxl/region: Add register_dax flag to control probe-time devdax setup
cxl/region, dax/hmem: Register devdax only when CXL owns Soft Reserved
span
cxl/region, dax/hmem: Tear down CXL regions when HMEM reclaims Soft
Reserved
dax/hmem: Reintroduce Soft Reserved ranges back into the iomem tree
arch/x86/kernel/e820.c | 2 +-
drivers/cxl/acpi.c | 2 +-
drivers/cxl/core/region.c | 181 ++++++++++++++++++++++++++++++++++++--
drivers/cxl/cxl.h | 17 ++++
drivers/dax/Kconfig | 2 +
drivers/dax/hmem/device.c | 4 +-
drivers/dax/hmem/hmem.c | 137 ++++++++++++++++++++++++++---
include/linux/ioport.h | 13 ++-
kernel/resource.c | 92 ++++++++++++++++---
9 files changed, 415 insertions(+), 35 deletions(-)
base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
--
2.17.1