Message-ID: <1e2046f3-6fe5-432f-b5e8-a9d9be99e7cd@fujitsu.com>
Date: Mon, 21 Jul 2025 07:38:11 +0000
From: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
CC: Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron
<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, Alison
Schofield <alison.schofield@...el.com>, Vishal Verma
<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan Williams
<dan.j.williams@...el.com>, Matthew Wilcox <willy@...radead.org>, Jan Kara
<jack@...e.cz>, "Rafael J . Wysocki" <rafael@...nel.org>, Len Brown
<len.brown@...el.com>, Pavel Machek <pavel@...nel.org>, Li Ming
<ming.li@...omail.com>, Jeff Johnson <jeff.johnson@....qualcomm.com>, Ying
Huang <huang.ying.caritas@...il.com>, "Xingtao Yao (Fujitsu)"
<yaoxt.fnst@...itsu.com>, Peter Zijlstra <peterz@...radead.org>, Greg KH
<gregkh@...uxfoundation.org>, Nathan Fontenot <nathan.fontenot@....com>,
Terry Bowman <terry.bowman@....com>, Robert Richter <rrichter@....com>,
Benjamin Cheatham <benjamin.cheatham@....com>, PradeepVineshReddy Kodamati
<PradeepVineshReddy.Kodamati@....com>
Subject: Re: [PATCH v5 0/7] Add managed SOFT RESERVE resource handling
Smita,
I have not yet completed all of my local test patterns. Nonetheless, in addition to the issues highlighted by Alison, I have also encountered some regressions.
Based on your conversation with Alison, it appears you have decided to refactor the series, so I intend to stop testing this version until the updated iteration is available.
Here is what I have verified so far (kernel built on top of cxl/next 20250718):
A) No Soft Reserved (BIOS did not expose EFI_SPECIAL_PURPOSE)
- A.1 Decoder not committed (default QEMU emulation)
Before:
```
fffc0000-ffffffff : Reserved
100000000-27fffffff : System RAM
5c0001128-5c00011b7 : port1
5d0000000-6cfffffff : CXL Window 0
6d0000000-7cfffffff : CXL Window 1
7000000000-700000ffff : PCI Bus 0000:0c
7000000000-700000ffff : 0000:0c:00.0
7000010000-700001ffff : PCI Bus 0000:0e
7000010000-700001ffff : 0000:0e:00.0
7000011080-70000110d7 : mem0
```
After (CXL window is absent):
```
fed00000-fed003ff : PNP0103:00
fed1c000-fed1ffff : Reserved
feffc000-feffffff : Reserved
fffc0000-ffffffff : Reserved
100000000-27fffffff : System RAM
7000000000-700000ffff : PCI Bus 0000:0c
7000000000-700000ffff : 0000:0c:00.0
7000010000-700001ffff : PCI Bus 0000:0e
7000010000-700001ffff : 0000:0e:00.0
7000020000-703fffffff : PCI Bus 0000:00
```
- A.2 Decoder is committed
Before:
```
100000000-27fffffff : System RAM
5c0001128-5c00011b7 : port1
5d0000000-6cfffffff : CXL Window 0
5d0000000-6cfffffff : region0
5d0000000-6cfffffff : dax0.0
5d0000000-6cfffffff : System RAM (kmem)
7000000000-700000ffff : PCI Bus 0000:0c
7000000000-700000ffff : 0000:0c:00.0
```
After (CXL window is absent):
```
feffc000-feffffff : Reserved
fffc0000-ffffffff : Reserved
100000000-27fffffff : System RAM
7000000000-700000ffff : PCI Bus 0000:0c
7000000000-700000ffff : 0000:0c:00.0
7000010000-700001ffff : PCI Bus 0000:0e
7000010000-700001ffff : 0000:0e:00.0
7000020000-703fffffff : PCI Bus 0000:00
```
B) EFI_SPECIAL_PURPOSE is set
- B.1 Decoder not committed
Before:
```
5d0000000-7cfffffff : Soft Reserved
5d0000000-6cfffffff : CXL Window 0
6d0000000-7cfffffff : CXL Window 1
```
After (fallback to hmem):
```
5d0000000-7cfffffff : Soft Reserved
5d0000000-7cfffffff : dax0.0
5d0000000-7cfffffff : System RAM (kmem)
```
- B.2 Decoder is committed
Before:
```
5d0000000-6cfffffff : CXL Window 0
5d0000000-6cfffffff : region0
5d0000000-6cfffffff : Soft Reserved
5d0000000-6cfffffff : dax0.0
5d0000000-6cfffffff : System RAM (kmem)
```
After (fallback to hmem):
```
5d0000000-6cfffffff : Soft Reserved
5d0000000-6cfffffff : dax0.0
5d0000000-6cfffffff : System RAM (kmem)
```
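For anyone who wants to repeat the before/after comparison above, here is a minimal sketch that diffs two saved /proc/iomem dumps and reports which named resources disappeared. The helper names (`parse_iomem`, `missing_names`) and the sample strings are only illustrative and are not part of the series; the dumps are assumed to have been captured as root so that the addresses are not zeroed.

```python
import re

# One /proc/iomem line looks like "5d0000000-6cfffffff : CXL Window 0",
# optionally indented to show nesting.
IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+)\s*:\s*(.+?)\s*$")


def parse_iomem(text):
    """Return (start, end, name) tuples for each well-formed line."""
    entries = []
    for line in text.splitlines():
        m = IOMEM_LINE.match(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return entries


def missing_names(before, after, prefix):
    """Names starting with `prefix` present in `before` but absent in `after`."""
    after_names = {name for _, _, name in parse_iomem(after)}
    return sorted({name for _, _, name in parse_iomem(before)
                   if name.startswith(prefix) and name not in after_names})


# Abbreviated samples taken from case A.1 above.
before = """\
100000000-27fffffff : System RAM
5d0000000-6cfffffff : CXL Window 0
6d0000000-7cfffffff : CXL Window 1
"""
after = """\
100000000-27fffffff : System RAM
7000020000-703fffffff : PCI Bus 0000:00
"""

print(missing_names(before, after, "CXL Window"))
# prints ['CXL Window 0', 'CXL Window 1']
```

The same call with the prefix "region" or "Soft Reserved" covers the B.1 and B.2 cases.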
Thanks
Zhijian
On 16/07/2025 02:04, Smita Koralahalli wrote:
> This series introduces the ability to manage SOFT RESERVED iomem
> resources, enabling the CXL driver to remove any portions that
> intersect with created CXL regions.
>
> The current approach of leaving SOFT RESERVED entries as is can result
> in failures during device hotplug such as CXL because the address range
> remains reserved and unavailable for reuse even after region teardown.
>
> To address this, the CXL driver now uses a background worker that waits
> for cxl_mem driver probe to complete before scanning for intersecting
> resources. Then the driver walks through created CXL regions to trim any
> intersections with SOFT RESERVED resources in the iomem tree.
>
> The following scenarios have been tested:
>
> Example 1: Exact alignment, soft reserved is a child of the region
>
> |---------- "Soft Reserved" -----------|
> |-------------- "Region #" ------------|
>
> Before:
> 1050000000-304fffffff : CXL Window 0
> 1050000000-304fffffff : region0
> 1050000000-304fffffff : Soft Reserved
> 1080000000-2fffffffff : dax0.0
> 1080000000-2fffffffff : System RAM (kmem)
>
> After:
> 1050000000-304fffffff : CXL Window 0
> 1050000000-304fffffff : region0
> 1080000000-2fffffffff : dax0.0
> 1080000000-2fffffffff : System RAM (kmem)
>
> Example 2: Start and/or end aligned and soft reserved spans multiple
> regions
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|
> or
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|
>
> Before:
> 850000000-684fffffff : Soft Reserved
> 850000000-284fffffff : CXL Window 0
> 850000000-284fffffff : region3
> 850000000-284fffffff : dax0.0
> 850000000-284fffffff : System RAM (kmem)
> 2850000000-484fffffff : CXL Window 1
> 2850000000-484fffffff : region4
> 2850000000-484fffffff : dax1.0
> 2850000000-484fffffff : System RAM (kmem)
> 4850000000-684fffffff : CXL Window 2
> 4850000000-684fffffff : region5
> 4850000000-684fffffff : dax2.0
> 4850000000-684fffffff : System RAM (kmem)
>
> After:
> 850000000-284fffffff : CXL Window 0
> 850000000-284fffffff : region3
> 850000000-284fffffff : dax0.0
> 850000000-284fffffff : System RAM (kmem)
> 2850000000-484fffffff : CXL Window 1
> 2850000000-484fffffff : region4
> 2850000000-484fffffff : dax1.0
> 2850000000-484fffffff : System RAM (kmem)
> 4850000000-684fffffff : CXL Window 2
> 4850000000-684fffffff : region5
> 4850000000-684fffffff : dax2.0
> 4850000000-684fffffff : System RAM (kmem)
>
> Example 3: No alignment
> |---------- "Soft Reserved" ----------|
> |---- "Region #" ----|
>
> Before:
> 00000000-3050000ffd : Soft Reserved
> ..
> ..
> 1050000000-304fffffff : CXL Window 0
> 1050000000-304fffffff : region1
> 1080000000-2fffffffff : dax0.0
> 1080000000-2fffffffff : System RAM (kmem)
>
> After:
> 00000000-104fffffff : Soft Reserved
> ..
> ..
> 1050000000-304fffffff : CXL Window 0
> 1050000000-304fffffff : region1
> 1080000000-2fffffffff : dax0.0
> 1080000000-2fffffffff : System RAM (kmem)
> 3050000000-3050000ffd : Soft Reserved
>
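The range arithmetic behind the three examples above can be checked with a small sketch. This is only an illustration of the expected before/after ranges, not the logic of remove_soft_reserved() in the series; the function name `trim_soft_reserved` is hypothetical and all bounds are inclusive, as in /proc/iomem.

```python
def trim_soft_reserved(sr_start, sr_end, region_start, region_end):
    """Return the pieces of [sr_start, sr_end] left after cutting out the region."""
    # No intersection: the Soft Reserved range is left untouched.
    if region_end < sr_start or region_start > sr_end:
        return [(sr_start, sr_end)]
    remaining = []
    if sr_start < region_start:
        # Piece below the region survives.
        remaining.append((sr_start, region_start - 1))
    if sr_end > region_end:
        # Piece above the region survives.
        remaining.append((region_end + 1, sr_end))
    return remaining


# Example 3 (no alignment): the region sits in the middle of Soft Reserved.
for start, end in trim_soft_reserved(0x0, 0x3050000ffd, 0x1050000000, 0x304fffffff):
    print(f"{start:08x}-{end:08x} : Soft Reserved")
# prints:
# 00000000-104fffffff : Soft Reserved
# 3050000000-3050000ffd : Soft Reserved
```

With the Example 1 ranges (exact alignment) the function returns an empty list, which matches the Soft Reserved entry disappearing entirely.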
> Link to v4:
> https://lore.kernel.org/linux-cxl/20250603221949.53272-1-Smita.KoralahalliChannabasappa@amd.com
>
> v5 updates:
> - Handled cases where CXL driver loads early even before HMEM driver is
> initialized.
> - Introduced callback functions to resolve dependencies.
> - Rename suspend.c to probe_state.c.
> - Refactor cxl_acpi_probe() to use a single exit path.
> - Commit description update to justify cxl_mem_active() usage.
> - Change from kmalloc -> kzalloc in add_soft_reserved().
> - Change from goto to if else blocks inside remove_soft_reserved().
> - DEFINE_RES_MEM_NAMED -> DEFINE_RES_NAMED_DESC.
> - Comments for flags inside remove_soft_reserved().
> - Add resource_lock inside normalize_resource().
> - bus_find_next_device -> bus_find_device.
> - Skip DAX consumption of soft reserves inside hmat with
> CONFIG_CXL_ACPI checks.
>
> v4 updates:
> - Split first patch into 4 smaller patches.
> - Correct the logic for cxl_pci_loaded() and cxl_mem_active() to return
> false at default instead of true.
> - Cleanup cxl_wait_for_pci_mem() to remove config checks for cxl_pci
> and cxl_mem.
> - Fixed multiple bugs and build issues which includes correcting
> walk_iomem_resc_desc() and calculations of alignments.
>
> v3 updates:
> - Remove srmem resource tree from kernel/resource.c, this is no longer
> needed in the current implementation. All SOFT RESERVE resources now
> put on the iomem resource tree.
> - Remove the no longer needed SOFT_RESERVED_MANAGED kernel config option.
> - Add the 'nid' parameter back to hmem_register_resource();
> - Remove the no longer used soft reserve notification chain (introduced
> in v2). The dax driver is now notified of SOFT RESERVED resources by
> the CXL driver.
>
> v2 updates:
> - Add config option SOFT_RESERVE_MANAGED to control use of the
> separate srmem resource tree at boot.
> - Only add SOFT RESERVE resources to the soft reserve tree during
> boot, they go to the iomem resource tree after boot.
> - Remove the resource trimming code in the previous patch to re-use
> the existing code in kernel/resource.c
> - Add functionality for the cxl acpi driver to wait for the cxl PCI
> and mem drivers to load.
>
> Smita Koralahalli (7):
> cxl/acpi: Refactor cxl_acpi_probe() to always schedule fallback DAX
> registration
> cxl/core: Rename suspend.c to probe_state.c and remove
> CONFIG_CXL_SUSPEND
> cxl/acpi: Add background worker to coordinate with cxl_mem probe
> completion
> cxl/region: Introduce SOFT RESERVED resource removal on region
> teardown
> dax/hmem: Save the DAX HMEM platform device pointer
> dax/hmem, cxl: Defer DAX consumption of SOFT RESERVED resources until
> after CXL region creation
> dax/hmem: Preserve fallback SOFT RESERVED regions if DAX HMEM loads
> late
>
> drivers/acpi/numa/hmat.c | 4 +
> drivers/cxl/Kconfig | 4 -
> drivers/cxl/acpi.c | 50 +++++--
> drivers/cxl/core/Makefile | 2 +-
> drivers/cxl/core/{suspend.c => probe_state.c} | 10 +-
> drivers/cxl/core/region.c | 135 ++++++++++++++++++
> drivers/cxl/cxl.h | 4 +
> drivers/cxl/cxlmem.h | 9 --
> drivers/dax/hmem/Makefile | 1 +
> drivers/dax/hmem/device.c | 62 ++++----
> drivers/dax/hmem/hmem.c | 14 +-
> drivers/dax/hmem/hmem_notify.c | 29 ++++
> include/linux/dax.h | 7 +-
> include/linux/ioport.h | 1 +
> include/linux/pm.h | 7 -
> kernel/resource.c | 34 +++++
> 16 files changed, 307 insertions(+), 66 deletions(-)
> rename drivers/cxl/core/{suspend.c => probe_state.c} (62%)
> create mode 100644 drivers/dax/hmem/hmem_notify.c
>