Message-ID: <d4b9402e-7dc9-4933-bded-0d92f4aeb064@amd.com>
Date: Tue, 4 Nov 2025 18:59:57 -0800
From: "Koralahalli Channabasappa, Smita" <skoralah@....com>
To: Tomasz Wolski <tomasz.wolski@...itsu.com>, alison.schofield@...el.com,
Dan Williams <dan.j.williams@...el.com>
Cc: Smita.KoralahalliChannabasappa@....com, ardb@...nel.org,
benjamin.cheatham@....com, bp@...en8.de, dan.j.williams@...el.com,
dave.jiang@...el.com, dave@...olabs.net, gregkh@...uxfoundation.org,
huang.ying.caritas@...il.com, ira.weiny@...el.com, jack@...e.cz,
jeff.johnson@....qualcomm.com, jonathan.cameron@...wei.com,
len.brown@...el.com, linux-cxl@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org, lizhijian@...itsu.com, ming.li@...omail.com,
nathan.fontenot@....com, nvdimm@...ts.linux.dev, pavel@...nel.org,
peterz@...radead.org, rafael@...nel.org, rrichter@....com,
terry.bowman@....com, vishal.l.verma@...el.com, willy@...radead.org,
yaoxt.fnst@...itsu.com
Subject: Re: [PATCH v3 0/5] dax/hmem, cxl: Coordinate Soft Reserved handling
with CXL
Hi Tomasz,
On 11/3/2025 3:18 AM, Tomasz Wolski wrote:
> Hi Alison and Smita,
>
> I’ve been following your patch proposal and testing it on a few QEMU setups
>
>> Will it work to search directly for the region above by using params
>> IORESOURCE_MEM, IORES_DESC_NONE. This way we only get region conflicts,
>> no empty windows to examine. I think that might replace cxl_region_exists()
>> work below.
>
> I see expected 'dropping CXL range' message (case when region covers full CXL window)
>
> [ 31.783945] hmem_platform hmem_platform.0: deferring range to CXL: [mem 0xa90000000-0xb8fffffff flags 0x80000200]
> [ 31.784609] deferring range to CXL: [mem 0xa90000000-0xb8fffffff flags 0x80000200]
> [ 31.790588] hmem_platform hmem_platform.0: dropping CXL range: [mem 0xa90000000-0xb8fffffff flags 0x80000200]
> [ 31.791102] dropping CXL range: [mem 0xa90000000-0xb8fffffff flags 0x80000200]
>
> a90000000-b8fffffff : CXL Window 0
> a90000000-b8fffffff : region0
> a90000000-b8fffffff : dax0.0
> a90000000-b8fffffff : System RAM (kmem)
>
> [ 31.384899] hmem_platform hmem_platform.0: deferring range to CXL: [mem 0xa90000000-0xc8fffffff flags 0x80000200]
> [ 31.385586] deferring range to CXL: [mem 0xa90000000-0xc8fffffff flags 0x80000200]
> [ 31.391107] hmem_platform hmem_platform.0: dropping CXL range: [mem 0xa90000000-0xc8fffffff flags 0x80000200]
> [ 31.391676] dropping CXL range: [mem 0xa90000000-0xc8fffffff flags 0x80000200]
>
> a90000000-c8fffffff : CXL Window 0
> a90000000-b8fffffff : region0
> a90000000-b8fffffff : dax0.0
> a90000000-b8fffffff : System RAM (kmem)
> b90000000-c8fffffff : region1
> b90000000-c8fffffff : dax1.0
> b90000000-c8fffffff : System RAM (kmem)
>
> a90000000-b8fffffff : CXL Window 0
> a90000000-b8fffffff : region0
> a90000000-b8fffffff : dax0.0
> a90000000-b8fffffff : System RAM (kmem)
> b90000000-c8fffffff : CXL Window 1
> b90000000-c8fffffff : region1
> b90000000-c8fffffff : dax1.0
> b90000000-c8fffffff : System RAM (kmem)
>
> However, when testing version with cxl_region_exists() I didn't see expected 'registering CXL range' message
> when the CXL region does not fully occupy CXL window - please see below.
> I should mention that I’m still getting familiar with CXL internals, so maybe I might be missing some context :)
>
> a90000000-bcfffffff : CXL Window 0
> a90000000-b8fffffff : region0
> a90000000-b8fffffff : dax0.0
> a90000000-b8fffffff : System RAM (kmem)
>
> [ 30.434385] hmem_platform hmem_platform.0: deferring range to CXL: [mem 0xa90000000-0xbcfffffff flags 0x80000200]
> [ 30.435116] deferring range to CXL: [mem 0xa90000000-0xbcfffffff flags 0x80000200]
> [ 30.436530] hmem_platform hmem_platform.0: dropping CXL range: [mem 0xa90000000-0xbcfffffff flags 0x80000200]
> [ 30.437070] hmem_platform hmem_platform.0: dropping CXL range: [mem 0xa90000000-0xbcfffffff flags 0x80000200]
> [ 30.437599] dropping CXL range: [mem 0xa90000000-0xbcfffffff flags 0x80000200]
Thanks for testing and sharing the logs.
After off-list discussion with Alison and Dan (please jump in if I’m
misrepresenting anything), here is the current understanding:
Ownership is determined by CXL regions, not window sizing. A CXL Window
may be larger or smaller than the Soft Reserved (SR) span and that
should not affect the decision.
The key question is: do the CXL regions fully and contiguously cover
the entire Soft Reserved range?
Yes - CXL owns SR (“dropping CXL range”).
No - CXL must give up SR (“registering CXL range”). More on giving up SR
below.
The previous child->start <= start && child->end <= end check needs to
be replaced with a full coverage test:
1. Decide ownership based on region coverage: We check whether all CXL
regions together fully and contiguously cover the "given" SR range.
If fully covered - CXL owns it.
If not fully covered - CXL must give up and the SR is owned by HMEM.
2. If CXL must give up - Remove the CXL regions that overlap SR before
registering the SR via hmem_register_device().
3. Ensure dax_kmem never onlines memory until after this decision.
dax_kmem must always probe after dax_hmem decides ownership.
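To make the full coverage test in step 1 concrete, here is a rough
userspace sketch. It is not the actual patch: struct span and
regions_cover_range() are hypothetical names used only for illustration,
and it assumes the regions are passed in sorted by start address and do
not overlap each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address span [start, end]; not a kernel type. */
struct span {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if the regions, taken together, cover every address of sr
 * with no gap. Assumes regions[] is sorted by start and non-overlapping.
 */
static bool regions_cover_range(const struct span *regions, size_t n,
				struct span sr)
{
	uint64_t cursor = sr.start;	/* first address not yet covered */
	size_t i;

	assert(sr.start <= sr.end);

	for (i = 0; i < n; i++) {
		if (regions[i].end < cursor)
			continue;	/* entirely below the uncovered part */
		if (regions[i].start > cursor)
			return false;	/* gap before this region */
		if (regions[i].end >= sr.end)
			return true;	/* reached the end of the SR range */
		cursor = regions[i].end + 1;
	}
	return false;			/* ran out of regions before sr.end */
}
```

A true result would correspond to "dropping CXL range" (CXL owns SR) and
a false result to "registering CXL range" (HMEM owns SR). The window size
is deliberately not an input, since it should not affect the decision.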
Some of the valid configs (CXL owns: drop CXL range)
1. 3ff0d0000000-3ff10fffffff : SR
3ff0d0000000-3ff10fffffff : Window 1
3ff0d0000000-3ff0dfffffff : region1
3ff0e0000000-3ff0efffffff : region2
3ff0f0000000-3ff0ffffffff : region3
3ff100000000-3ff10fffffff : region4
2. 3ff0d0000000-3ff10fffffff : Window 1
3ff0d0000000-3ff0dfffffff : SR
3ff0d0000000-3ff0dfffffff : region1
3ff0e0000000-3ff0efffffff : SR
3ff0e0000000-3ff0efffffff : region2
3ff0f0000000-3ff0ffffffff : SR
3ff0f0000000-3ff0ffffffff : region3
3ff100000000-3ff10fffffff : SR
3ff100000000-3ff10fffffff : region4
3. 3ff0d0000000-3ff20fffffff : Window 1
3ff0d0000000-3ff10fffffff : SR
3ff0d0000000-3ff0dfffffff : region1
3ff0e0000000-3ff0efffffff : region2
3ff0f0000000-3ff0ffffffff : region3
3ff100000000-3ff10fffffff : region4
4. 3ff0d0000000-3ff10fffffff : SR
3ff0d0000000-3ff10fffffff : Window 1
3ff0d0000000-3ff10fffffff : region1
Invalid configs (HMEM owns: registering CXL range)
1. 3ff0d0000000-3ff20fffffff : SR
3ff0d0000000-3ff20fffffff : Window 1
3ff0d0000000-3ff10fffffff : region1
2. 3ff0d0000000-3ff20fffffff : SR
3ff0d0000000-3ff10fffffff : Window 1
3ff0d0000000-3ff0dfffffff : region1
3ff0e0000000-3ff0efffffff : region2
3ff0f0000000-3ff0ffffffff : region3
3ff100000000-3ff10fffffff : region4
3. region2 assembly failed or incorrect BIOS config
3ff0d0000000-3ff10fffffff : SR
3ff0d0000000-3ff10fffffff : Window 1
3ff0d0000000-3ff0dfffffff : region1
3ff0f0000000-3ff0ffffffff : region3
3ff100000000-3ff10fffffff : region4
I will work on incorporating the 3 steps mentioned above.
Thanks
Smita
>
> Thanks,
> Tomasz