Message-ID: <CAPcyv4g5RoHhXhkKQaYkqYLN1y3KavbGeM1zVus-3fY5Q+JdxA@mail.gmail.com>
Date: Sat, 23 Mar 2019 10:21:30 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Yang Shi <yang.shi@...ux.alibaba.com>
Cc: Michal Hocko <mhocko@...e.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Rik van Riel <riel@...riel.com>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Keith Busch <keith.busch@...el.com>,
Fengguang Wu <fengguang.wu@...el.com>,
"Du, Fan" <fan.du@...el.com>, "Huang, Ying" <ying.huang@...el.com>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 01/10] mm: control memory placement by nodemask for two
tier main memory
On Fri, Mar 22, 2019 at 9:45 PM Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
> When running applications on a machine with NVDIMM exposed as a NUMA
> node, memory allocations may end up on the NVDIMM node. This can result
> in silent performance degradation and regressions due to the difference
> in hardware properties.
>
> A DRAM-first policy should be obeyed to prevent surprising regressions.
> Any non-DRAM nodes should be excluded from default allocations. Use a
> nodemask to control the memory placement: introduce def_alloc_nodemask,
> which has only DRAM nodes set. Any non-DRAM allocation must be requested
> explicitly via NUMA policy (see the sketch below).
>
> In the future we may be able to extract the memory characteristics from
> HMAT or another source to build up the default allocation nodemask.
> However, for the time being just distinguish DRAM from PMEM (non-DRAM)
> nodes by the SRAT non-volatile flag.
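
[ For illustration, a minimal userspace sketch of what "requested
  explicitly via NUMA policy" would look like; node 1 is a hypothetical
  PMEM node id, and the program links with -lnuma for the mbind()
  wrapper. This is not part of the patch. ]

	#include <stddef.h>
	#include <numaif.h>	/* mbind(), MPOL_BIND; link with -lnuma */
	#include <sys/mman.h>

	/* Map anonymous memory and bind it to the assumed PMEM node 1,
	 * which a DRAM-only default mask would otherwise never pick. */
	static void *pmem_alloc(size_t len)
	{
		unsigned long nodemask = 1UL << 1; /* assumed PMEM node 1 */
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return NULL;
		if (mbind(p, len, MPOL_BIND, &nodemask,
			  sizeof(nodemask) * 8, 0)) {
			munmap(p, len);
			return NULL;
		}
		return p;
	}
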
>
> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
> ---
> arch/x86/mm/numa.c | 1 +
> drivers/acpi/numa.c | 8 ++++++++
> include/linux/mmzone.h | 3 +++
> mm/page_alloc.c | 18 ++++++++++++++++--
> 4 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index dfb6c4d..d9e0ca4 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -626,6 +626,7 @@ static int __init numa_init(int (*init_func)(void))
> nodes_clear(numa_nodes_parsed);
> nodes_clear(node_possible_map);
> nodes_clear(node_online_map);
> + nodes_clear(def_alloc_nodemask);
> memset(&numa_meminfo, 0, sizeof(numa_meminfo));
> WARN_ON(memblock_set_node(0, ULLONG_MAX, &memblock.memory,
> MAX_NUMNODES));
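
[ The include/linux/mmzone.h and mm/page_alloc.c hunks are not quoted
  here; presumably they declare and define the mask and fall back to it
  when the caller supplies no explicit nodemask. A sketch of that shape,
  with a hypothetical helper name, not the actual hunks: ]

	/* include/linux/mmzone.h (sketch) */
	extern nodemask_t def_alloc_nodemask;

	/* mm/page_alloc.c (sketch) */
	nodemask_t def_alloc_nodemask;

	/* hypothetical helper: default to DRAM-only placement unless
	 * the caller passed an explicit nodemask */
	static inline nodemask_t *default_alloc_nodemask(nodemask_t *nodemask)
	{
		return nodemask ? nodemask : &def_alloc_nodemask;
	}
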
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> index 867f6e3..79dfedf 100644
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -296,6 +296,14 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
> goto out_err_bad_srat;
> }
>
> + /*
> + * Non-volatile memory is excluded from the zonelist by default.
> + * Only regular DRAM nodes are set in the default allocation
> + * node mask.
> + */
> + if (!(ma->flags & ACPI_SRAT_MEM_NON_VOLATILE))
> + node_set(node, def_alloc_nodemask);
Hmm, no, I don't think we should do this. Especially considering that
current generation NVDIMMs are energy-backed DRAM, there is no
performance difference that should be assumed from the non-volatile
flag alone.

Why isn't the default SLIT distance sufficient for ensuring a DRAM-first
default policy?
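
[ For reference, a rough sketch of the shape of the existing fallback
  ordering, not new code: build_zonelists() picks fallback nodes
  nearest-first by node_distance(), so a PMEM node advertised at a
  larger SLIT distance already sorts behind every DRAM node. ]

	/* sketch of distance-ordered fallback selection, modeled on
	 * the existing find_next_best_node() logic */
	static int nearest_unused_node(int from, const nodemask_t *used)
	{
		int n, best = NUMA_NO_NODE;
		int best_dist = INT_MAX;

		for_each_online_node(n) {
			if (node_isset(n, *used))
				continue;
			if (node_distance(from, n) < best_dist) {
				best_dist = node_distance(from, n);
				best = n;
			}
		}
		return best;
	}

With, say, a SLIT distance of 10 to local DRAM and something larger
advertised for PMEM, every DRAM node is exhausted before a PMEM node
is even considered.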