Message-ID: <2d1e23ad-7ec1-483b-88b3-70ce19b69106@phytium.com.cn>
Date: Thu, 22 Jan 2026 16:03:49 +0800
From: Cui Chao <cuichao1753@...tium.com.cn>
To: dan.j.williams@...el.com, Andrew Morton <akpm@...ux-foundation.org>
Cc: Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Mike Rapoport <rppt@...nel.org>, Wang Yinfeng <wangyinfeng@...tium.com.cn>,
linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 1/1] mm: numa_memblks: Identify the accurate NUMA ID of
CFMW
On 1/16/2026 3:50 AM, dan.j.williams@...el.com wrote:
> Andrew Morton wrote:
>> On Thu, 15 Jan 2026 17:43:02 +0800 Cui Chao <cuichao1753@...tium.com.cn> wrote:
>>
>>> When a CXL RAM region is created in userspace, the memory capacity of
>>> the newly created region is not added to the CFMW-dedicated NUMA node.
>>> Instead, it is accumulated into an existing NUMA node (e.g., NUMA0
>>> containing RAM). This makes it impossible to clearly distinguish between
>>> the two types of memory, which may affect memory-tiering applications.
>>>
>> OK, thanks, I added this to the changelog. Please retain it when
>> sending v3.
>>
>> What I'm actually looking for here are answers to the questions
>>
>> Should we backport this into -stable kernels and if so, why?
>> And if not, why not?
>>
>> So a very complete description of the runtime effects really helps
>> myself and others to decide which kernels to patch. And it helps
>> people to understand *why* we made that decision.
>>
>> And sorry, but "may affect memory-tiering applications" isn't very
>> complete!
>>
>> So please, tell us how much our users are hurting from this and please
>> make a recommendation on the backporting decision.
>>
> To add on here, Cui, please describe which shipping hardware platforms
> in the wild create physical address maps like this. For example, if this
> is something that only occurs in QEMU configurations or similar, then
> the urgency is low and it is debatable if Linux should even worry about
> fixing it.
>
> I know that x86 platforms typically do not do this. It is also
> within the realm of possibility for platform firmware to fix. So in
> addition to platform impact please also clarify why folks can not just
> ask for a firmware update to get this fixed without updating their
> kernel.

Andrew, Dan, thank you for your review.

1. Issue Impact and Backport Recommendation:

This patch fixes an issue on hardware platforms (not QEMU emulation)
where, during the dynamic creation of a CXL RAM region, the memory
capacity is not assigned to the correct CFMW-dedicated NUMA node. This
issue leads to:

* Failure of the memory tiering mechanism: The system is designed to
  treat System RAM as fast memory and CXL memory as slow memory. For
  performance optimization, hot pages may be migrated to fast memory
  while cold pages are migrated to slow memory. The system uses NUMA
  IDs as an index to identify different tiers of memory. If the NUMA
  ID for CXL memory is calculated incorrectly and its capacity is
  aggregated into the NUMA node containing System RAM (i.e., the node
  for fast memory), the CXL memory cannot be correctly identified. It
  may be misjudged as fast memory, thereby affecting performance
  optimization strategies.
* Inability to distinguish between System RAM and CXL memory even for
  simple manual binding: Tools like numactl and other NUMA policy
  utilities cannot differentiate between System RAM and CXL memory,
  making it impossible to perform reasonable memory binding.
* Inaccurate system reporting: Tools like numactl -H would display
  memory capacities that do not match the actual physical hardware
  layout, impacting operations and monitoring.

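The reporting symptom in the last item can be checked by comparing per-node MemTotal before and after the region is created. The sketch below is only an illustration and is not part of the patch: it parses text in the format of /sys/devices/system/node/nodeN/meminfo, the function names are hypothetical, and the sample sizes are made up rather than taken from a real machine.

```python
import re

# Matches per-node meminfo lines such as "Node 0 MemTotal:  16777216 kB".
_MEMTOTAL = re.compile(r"^Node\s+(\d+)\s+MemTotal:\s+(\d+)\s+kB\s*$", re.MULTILINE)


def node_memtotal_kib(meminfo_text):
    """Return {node_id: MemTotal in KiB} for every node found in the text."""
    return {int(nid): int(kib) for nid, kib in _MEMTOTAL.findall(meminfo_text)}


def grown_nodes(before, after):
    """Return {node_id: growth in KiB} for nodes whose MemTotal increased."""
    return {
        nid: total - before.get(nid, 0)
        for nid, total in after.items()
        if total > before.get(nid, 0)
    }


# Made-up sample values: 16 GiB of System RAM on node 0, then an 8 GiB
# CXL RAM region whose capacity shows up on node 0 instead of its own node.
SAMPLE_BEFORE = """\
Node 0 MemTotal:       16777216 kB
Node 0 MemFree:        12000000 kB
"""
SAMPLE_AFTER = """\
Node 0 MemTotal:       25165824 kB
Node 0 MemFree:        20000000 kB
"""

growth = grown_nodes(node_memtotal_kib(SAMPLE_BEFORE),
                     node_memtotal_kib(SAMPLE_AFTER))
```

With these sample values the growth is attributed to node 0, which is the symptom described above. On a system that assigns the region correctly, the growth would instead appear on the CFMW-dedicated node.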
This issue affects all users utilizing the CXL RAM functionality who
rely on memory tiering or NUMA-aware scheduling. Such configurations are
becoming increasingly common in data centers, cloud computing, and
high-performance computing scenarios.
Therefore, I recommend backporting this patch to all stable kernel
series that support dynamic CXL region creation.

2. Why a Kernel Update is Recommended Over a Firmware Update:

In the scenario of dynamic CXL region creation, the association between
the memory's HPA range and its corresponding NUMA node is established
when the kernel driver performs the commit operation. This is a runtime,
OS-managed operation where the platform firmware cannot intervene to
provide a fix.
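
As a simplified model of that association (not the kernel's implementation), the lookup can be pictured as resolving an HPA against a table of address ranges, each tagged with a node ID. The table contents, addresses, and names below are assumptions made up for this sketch.

```python
from typing import NamedTuple, Optional, Sequence


class Range(NamedTuple):
    start: int  # first address in the range (inclusive)
    end: int    # first address past the range (exclusive)
    nid: int    # NUMA node ID associated with the range


def hpa_to_node(hpa: int, table: Sequence[Range]) -> Optional[int]:
    """Return the node ID of the first range containing hpa, or None."""
    for r in table:
        if r.start <= hpa < r.end:
            return r.nid
    return None


# Hypothetical layout: System RAM on node 0, a CXL fixed memory window
# (CFMW) with its own dedicated node 1, and an unmapped gap in between.
TABLE = [
    Range(0x0000_0000_0000, 0x0004_0000_0000, 0),  # 0 to 16 GiB, System RAM
    Range(0x0010_0000_0000, 0x0018_0000_0000, 1),  # 64 to 96 GiB, CFMW
]

region_node = hpa_to_node(0x0010_0000_0000, TABLE)
```

In this model an address inside the window resolves to node 1. The problem reported in this thread corresponds to such a lookup yielding the System RAM node instead.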
Considering factors like hardware platform architecture, memory
resources, and others, such a physical address layout can indeed occur.
This patch does not introduce risk; it simply correctly handles the NUMA
node assignment for CXL RAM regions within such a physical address layout.
Thus, I believe a kernel fix is necessary.
--
Best regards,
Cui Chao.