Message-ID: <66311f5c59e15_10c212947f@dwillia2-mobl3.amr.corp.intel.com.notmuch>
Date: Tue, 30 Apr 2024 09:42:04 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Robert Richter <rrichter@....com>, "Rafael J. Wysocki"
<rafael@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar
<mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, <x86@...nel.org>, Andy Lutomirski
<luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Alison Schofield
<alison.schofield@...el.com>, Dan Williams <dan.j.williams@...el.com>
CC: <linux-acpi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-cxl@...r.kernel.org>, Robert Richter <rrichter@....com>, Derick Marks
<derick.w.marks@...el.com>, "H. Peter Anvin" <hpa@...or.com>, Len Brown
<lenb@...nel.org>
Subject: Re: [PATCH v6 1/7] x86/numa: Fix SRAT lookup of CFMWS ranges with
 numa_fill_memblks()

Robert Richter wrote:
> For configurations that have the kconfig option NUMA_KEEP_MEMINFO
> disabled, numa_fill_memblks() only returns NUMA_NO_MEMBLK (-1).
> SRAT lookup then fails because an existing SRAT memory range cannot be
> found for a CFMWS address range. This causes the addition of a
> duplicate numa_memblk with a different node id and a subsequent page
> fault and kernel crash during boot.
>
> Fix this by making numa_fill_memblks() always available regardless of
> NUMA_KEEP_MEMINFO.
>
> The fix also removes numa_fill_memblks() from sparsemem.h using
> __weak.
>
> From Dan:
>
> """
> It just feels like numa_fill_memblks() has absolutely no business being
> defined in arch/x86/include/asm/sparsemem.h.
>
> The only use for numa_fill_memblks() is to arrange for NUMA nodes to be
> applied to memory ranges hot-onlined by the CXL driver.
>
> It belongs right next to numa_add_memblk(), and I suspect
> arch/x86/include/asm/sparsemem.h was only chosen to avoid figuring out
> what to do about the fact that linux/numa.h does not include asm/numa.h
> and that all implementations either provide numa_add_memblk() or select
> the generic implementation.
>
> So I would prefer that this do the proper fix and get
> numa_fill_memblks() completely out of the sparsemem.h path.
>
> Something like the following which boots for me.
> """
>
> Note that the issue was initially introduced with [1]. But since
> phys_to_target_node() was originally used, which returned the valid
> node 0, an additional numa_memblk was not added. The node id was
> still wrong, though, and a message is then seen in the logs:
>
> kernel/numa.c: pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n",
>
> [1] commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each
> CFMWS not in SRAT")
>
> Suggested-by: Dan Williams <dan.j.williams@...el.com>
> Link: https://lore.kernel.org/all/66271b0072317_69102944c@dwillia2-xfh.jf.intel.com.notmuch/
> Fixes: 8f1004679987 ("ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window")
> Cc: Derick Marks <derick.w.marks@...el.com>
> Cc: Dan Williams <dan.j.williams@...el.com>
> Cc: Alison Schofield <alison.schofield@...el.com>
> Signed-off-by: Robert Richter <rrichter@....com>

Looks good,

Reviewed-by: Dan Williams <dan.j.williams@...el.com>