Message-ID: <67e4925eda2a5_13cb294fd@dwillia2-mobl3.amr.corp.intel.com.notmuch>
Date: Wed, 26 Mar 2025 19:48:46 -0400
From: Dan Williams <dan.j.williams@...el.com>
To: Yuquan Wang <wangyuquan1236@...tium.com.cn>,
<Jonathan.Cameron@...wei.com>, <dan.j.williams@...el.com>, <rppt@...nel.org>,
<akpm@...ux-foundation.org>, <david@...hat.com>, <bfaccini@...dia.com>,
<rafael@...nel.org>, <lenb@...nel.org>, <dave@...olabs.net>,
<dave.jiang@...el.com>, <alison.schofield@...el.com>,
<vishal.l.verma@...el.com>, <ira.weiny@...el.com>, <rrichter@....com>,
<haibo1.xu@...el.com>
CC: <linux-acpi@...r.kernel.org>, <linux-cxl@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <chenbaozi@...tium.com.cn>, Yuquan Wang
<wangyuquan1236@...tium.com.cn>
Subject: Re: [RFC PATCH v3 2/2] ACPI: NUMA: debug invalid unused PXM value for CFMWs

Yuquan Wang wrote:
> The absence of SRAT would cause the fake_pxm to be -1 and increment
> to 0, then send to acpi_parse_cfmws(). If there exists CXL memory
> ranges that are defined in the CFMWS and not already defined in the
> SRAT, the new node (node0) for the CXL memory would be invalid, as
> node0 is already in "used", and all CXL memory might be online on
> node0.

It is still not clear to me why this is a problem. If there is no SRAT
and CXL is the first memory proximity domain in the system then it
should be 0.

In other words, if it is a problem that the kernel is picking node0 for
CXL memory when there is no SRAT, the problem is that there is no SRAT.

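To make the arithmetic concrete, here is a minimal standalone sketch of
how the first fake PXM falls out of the node-to-PXM map. It is not kernel
code: first_fake_pxm() is a hypothetical helper that only follows the
shape of the loop quoted from acpi_numa_init() further down, and it
assumes unused map slots hold -1.

```c
#include <stddef.h>

/*
 * Illustrative only: first_fake_pxm() is a hypothetical helper, not a
 * kernel function. It follows the shape of the loop quoted from
 * acpi_numa_init(): start at -1, keep the highest PXM found in the
 * node-to-PXM map, then step to the next unused value.
 */
static int first_fake_pxm(const int *node_to_pxm_map, size_t nr_nodes)
{
	int fake_pxm = -1;	/* nothing parsed yet */

	for (size_t i = 0; i < nr_nodes; i++) {
		if (node_to_pxm_map[i] > fake_pxm)
			fake_pxm = node_to_pxm_map[i];
	}

	return fake_pxm + 1;	/* first PXM not claimed by the SRAT */
}
```

With a map of all -1 entries (no SRAT) the helper returns 0, so the first
CFMWS window lands in proximity domain 0. With PXMs 0 and 1 already
present it returns 2.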
> This utilizes node_set(0, nodes_found_map) to set pxm&node map. With
> this setting, acpi_map_pxm_to_node() could return the expected node
> value even if no SRAT.
>
> If SRAT is valid, the numa_memblks_init() would then utilize
> numa_move_tail_memblk() to move the numa_memblk from numa_meminfo to
> numa_reserved_meminfo in CFMWs fake node situation.
>
> If SRAT is missing or bad, the numa_memblks_init() would fail since
> init_func() would fail. And it causes that no numa_memblk in
> numa_reserved_meminfo list and the following dax_cxl driver could
> find the expected fake node.
>
> Use numa_add_reserved_memblk() to replace numa_add_memblk(), since
> the cxl numa_memblk added by numa_add_memblk() would finally be moved
> to numa_reserved_meminfo, and numa_add_reserved_memblk() here could
> add cxl numa_memblk into reserved list directly. Hence, no matter
> SRAT is good or not, cxl numa_memblk could be allocated to reserved
> list.

Do you not have other problems due to numa_register_meminfo() not being
called?

I would really like to say that the platform is buggy without an SRAT
and you should not expect anything useful from a NUMA perspective on
such a platform. Everything showing up in node0 in that case sounds
right.

>
> Signed-off-by: Yuquan Wang <wangyuquan1236@...tium.com.cn>
> ---
> drivers/acpi/numa/srat.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> index 00ac0d7bb8c9..50bfecfb9c16 100644
> --- a/drivers/acpi/numa/srat.c
> +++ b/drivers/acpi/numa/srat.c
> @@ -458,11 +458,12 @@ static int __init acpi_parse_cfmws(union acpi_subtable_headers *header,
> return -EINVAL;
> }
>
> - if (numa_add_memblk(node, start, end) < 0) {
> + if (numa_add_reserved_memblk(node, start, end) < 0) {

This change can move to patch 1 with the new justification I suggested.

...then we can have the pxm fixup discussion separately.

> /* CXL driver must handle the NUMA_NO_NODE case */
> pr_warn("ACPI NUMA: Failed to add memblk for CFMWS node %d [mem %#llx-%#llx]\n",
> node, start, end);
> }
> +
> node_set(node, numa_nodes_parsed);
>
> /* Set the next available fake_pxm value */
> @@ -646,8 +647,12 @@ int __init acpi_numa_init(void)
> if (node_to_pxm_map[i] > fake_pxm)
> fake_pxm = node_to_pxm_map[i];
> }
> - last_real_pxm = fake_pxm;
> - fake_pxm++;
> +
> + /* Make sure CFMWs fake node >= 1 */
> + fake_pxm = max(fake_pxm, 0);
> + last_real_pxm = fake_pxm++;
> + node_set(0, nodes_found_map);
> +
> acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_parse_cfmws,
> &fake_pxm);
>
> --
> 2.34.1
>