Message-ID: <Z2hGWoqZqwxJC4gM@gourry-fedora-PF4VCD3F>
Date: Sun, 22 Dec 2024 12:03:22 -0500
From: Gregory Price <gourry@...rry.net>
To: "Huang, Ying" <ying.huang@...ux.alibaba.com>
Cc: Hyeonggon Yoo <hyeonggon.yoo@...com>,
Joshua Hahn <joshua.hahnjy@...il.com>,
"gourry@...rry.net" <gourry@...rry.net>, kernel_team@...ynix.com,
42.hyeyoo@...il.com, "rafael@...nel.org" <rafael@...nel.org>,
"lenb@...nel.org" <lenb@...nel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
Honggyu Kim <honggyu.kim@...com>, Rakie Kim <rakie.kim@...com>,
"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>,
"dave.jiang@...el.com" <dave.jiang@...el.com>,
"horen.chuang@...ux.dev" <horen.chuang@...ux.dev>,
"hannes@...xchg.org" <hannes@...xchg.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"kernel-team@...a.com" <kernel-team@...a.com>
Subject: Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning
On Sun, Dec 22, 2024 at 03:21:32PM +0800, Huang, Ying wrote:
> Hyeonggon Yoo <hyeonggon.yoo@...com> writes:
>
> > On this server, ideally weighted interleaving should be configured
> > within a socket (e.g. local NUMA node + local CXL node) because
> > weighted interleaving does not consider the bandwidth when accessed
> > from a remote socket.
>
> If multiple sockets are considered, what is the best behavior?
>
> The process may be cross-socket too. So, we will need to use
> set_mempolicy() to bind tasks to sockets firstly. Then, it may be
> better to use per-task weights.
>
If we want to revisit this, we might be able to make task-local weights
work without a new syscall, but the use case was not clear enough, which
is why it was soft-nak'd originally.

vma-local weights are arguably more usable, but they require the task to
be numa-aware and probably require a new mempolicy syscall, because
mbind() has no remaining arguments.

Recall my original testing results from stream:
https://lore.kernel.org/linux-mm/20240202170238.90004-1-gregory.price@memverge.com/
Stream Benchmark (vs DRAM, 1 Socket + 1 CXL Device)
  Default interleave : -78%          (slower than DRAM)
  Global weighting   : -6% to +4%    (workload dependent)
  Targeted weights   : +2.5% to +4%  (consistently better than DRAM)
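
For readers who want a concrete picture of what the weights in these
results control, here is a minimal user-space sketch of weighted
round-robin page placement. It is a simplified model, not the kernel's
implementation: the function names, the node ids, and the 3:1 weights are
all hypothetical and chosen only for illustration. It assumes each node
receives `weight` consecutive pages per round before the next node is
used, and that default interleave behaves like every weight being 1.

```python
from itertools import islice


def weighted_interleave(weights):
    """Yield node ids forever, giving each node `weight` pages per round.

    `weights` maps a (hypothetical) NUMA node id to a positive integer.
    """
    if not weights or any(w < 1 for w in weights.values()):
        raise ValueError("need at least one node and weights >= 1")
    while True:
        for node, weight in weights.items():
            for _ in range(weight):
                yield node


def page_counts(weights, num_pages):
    """Count how many of `num_pages` allocations land on each node."""
    counts = {node: 0 for node in weights}
    for node in islice(weighted_interleave(weights), num_pages):
        counts[node] += 1
    return counts


# Hypothetical setup: node 0 is local DRAM, node 1 is a CXL device.
default_split = page_counts({0: 1, 1: 1}, 1000)   # unweighted interleave
weighted_split = page_counts({0: 3, 1: 1}, 1000)  # assumed 3:1 weights

print("default :", default_split)
print("weighted:", weighted_split)
```

With equal weights, half of the pages land on the slower node no matter
how much bandwidth it has, which is consistent with default interleave
doing badly above. Skewing the weights toward the faster node changes
the split in proportion to the weights.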
Just some context
~Gregory