Message-ID: <20201106155503.nkwuxr5mkneggzl7@intel.com>
Date: Fri, 6 Nov 2020 07:55:03 -0800
From: Ben Widawsky <ben.widawsky@...el.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: Mel Gorman <mgorman@...e.de>,
Peter Zijlstra <peterz@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Dave Hansen <dave.hansen@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Michal Hocko <mhocko@...e.com>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound
nodes
On 20-11-06 15:28:59, Huang, Ying wrote:
> Mel Gorman <mgorman@...e.de> writes:
>
> > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote:
> >> But from another point of view, I suggest removing the constraints of
> >> MPOL_F_MOF in the future. If the overhead of AutoNUMA isn't acceptable,
> >> why not just disable AutoNUMA globally via the sysctl knob?
> >>
> >
> > Because it's a double-edged sword. NUMA Balancing can make a workload
> > faster while still incurring more overhead than it should -- particularly
> > when multiple threads end up rescanning the same or unrelated regions.
> > Disabling it globally should really only happen when the application
> > running is the only application on the machine and has full NUMA
> > awareness.
>
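
(For reference, the global knob in question is the kernel.numa_balancing
sysctl, i.e. something like

    echo 0 > /proc/sys/kernel/numa_balancing   # system-wide off

which is exactly the all-or-nothing switch that becomes too coarse when
different workloads on the same machine want different behaviour.)
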
> Got it. So NUMA Balancing may in general benefit some workloads but
> hurt other workloads on the same machine, so we need a way to
> enable/disable NUMA Balancing per workload. Previously, this was
> done via explicit NUMA policy: if an explicit NUMA policy is
> specified, NUMA Balancing is disabled for that memory region or
> thread. For a memory region this can be re-enabled via MPOL_MF_LAZY,
> but it appears we still lack an MPOL_MF_LAZY equivalent for the thread.
>
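
To make the region-vs-thread distinction concrete, here is a minimal
userspace sketch (illustrative only: node 0 is just an example and most
error handling is omitted):

    #include <numaif.h>       /* mbind(), MPOL_*; link with -lnuma */
    #include <sys/mman.h>
    #include <stdio.h>

    int main(void)
    {
            size_t len = 2 * 1024 * 1024;
            void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            unsigned long nodes = 1UL << 0;       /* bind to node 0 */

            if (addr == MAP_FAILED)
                    return 1;
            /*
             * An explicit per-region policy like this opts the region out
             * of NUMA balancing.  Per the discussion above, MPOL_MF_LAZY
             * is the region-level way to opt back in (it marks the VMA's
             * policy MPOL_F_MOF); a task-wide set_mempolicy() has no
             * equivalent flag yet.
             */
            if (mbind(addr, len, MPOL_BIND, &nodes, 8 * sizeof(nodes),
                      MPOL_MF_MOVE))
                    perror("mbind");
            /* ... fault in and use the memory ... */
            return 0;
    }
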
> >> > It might still end up being better, but I was not aware of a
> >> > *realistic* workload that binds to multiple nodes
> >> > deliberately. Generally I expect that if an application is binding,
> >> > it's binding to one local node.
> >>
> >> Yes. It's not a popular configuration for now. But on a memory
> >> tiering system with both DRAM and PMEM, the DRAM and the PMEM in one
> >> socket become 2 NUMA nodes. To avoid too much cross-socket memory
> >> access while still taking advantage of both the DRAM and the PMEM,
> >> the workload can be bound to those 2 NUMA nodes (DRAM and PMEM).
> >>
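
To make that concrete, a rough sketch of such a binding (node numbers are
hypothetical: assume node 0 is this socket's DRAM and node 2 its PMEM):

    #include <numaif.h>       /* set_mempolicy(), MPOL_BIND; link with -lnuma */
    #include <stdio.h>

    int main(void)
    {
            /*
             * Hypothetical numbering: node 0 = this socket's DRAM,
             * node 2 = this socket's PMEM.  Binding the task to both
             * keeps its memory on-socket while allowing either tier.
             */
            unsigned long nodes = (1UL << 0) | (1UL << 2);

            if (set_mempolicy(MPOL_BIND, &nodes, 8 * sizeof(nodes)))
                    perror("set_mempolicy");
            /* ... exec or run the workload here ... */
            return 0;
    }

(The same thing can be done from the command line with something like
numactl --membind=0,2 <workload>.)
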
> >
> > Ok, that may lead to unpredictable performance, as there is only limited
> > control over which "important" applications should get DRAM rather than
> > PMEM. That's a long road, but this step is not incompatible with the
> > long-term goal.
>
> Yes. Ben Widawsky is working on a patchset to make it possible to
> prefer the remote DRAM instead of the local PMEM as follows,
>
> https://lore.kernel.org/linux-mm/20200630212517.308045-1-ben.widawsky@intel.com/
>
> Best Regards,
> Huang, Ying
>
A rebased version was posted here:
https://lore.kernel.org/linux-mm/20201030190238.306764-1-ben.widawsky@intel.com/
Thanks.
Ben