Message-ID: <20201106155503.nkwuxr5mkneggzl7@intel.com>
Date: Fri, 6 Nov 2020 07:55:03 -0800
From: Ben Widawsky <ben.widawsky@...el.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: Mel Gorman <mgorman@...e.de>,
Peter Zijlstra <peterz@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Dave Hansen <dave.hansen@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Michal Hocko <mhocko@...e.com>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound
nodes
On 20-11-06 15:28:59, Huang, Ying wrote:
> Mel Gorman <mgorman@...e.de> writes:
>
> > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote:
> >> But from another point of view, I suggest removing the constraints of
> >> MPOL_F_MOF in the future. If the overhead of AutoNUMA isn't acceptable,
> >> why not just disable AutoNUMA globally via the sysctl knob?
> >>
> >
> > Because it's a double-edged sword. NUMA Balancing can make a workload
> > faster while still incurring more overhead than it should -- particularly
> > when multiple threads end up rescanning the same or unrelated regions.
> > Disabling it globally should really only happen when the application
> > running is the only application on the machine and has full NUMA
> > awareness.
>
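
(For reference, the global knob in question is the kernel.numa_balancing
sysctl, i.e. something like

    echo 0 > /proc/sys/kernel/numa_balancing   # system-wide off

which is exactly the all-or-nothing switch that becomes too coarse when
different workloads on the same machine want different behaviour.)
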
> Got it. So NUMA Balancing may in general benefit some workloads but
> hurt other workloads on the same machine, so we need a way to
> enable/disable NUMA Balancing per workload. Previously, this was
> done via explicit NUMA policy: if an explicit NUMA policy is
> specified, NUMA Balancing is disabled for that memory region or
> thread. For a memory region this can be re-enabled via MPOL_MF_LAZY,
> but it appears we still lack an MPOL_MF_LAZY equivalent for the thread.
>
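
To make the region-vs-thread distinction concrete, here is a minimal
userspace sketch (illustrative only: node 0 is just an example and most
error handling is omitted):

    #include <numaif.h>       /* mbind(), MPOL_*; link with -lnuma */
    #include <sys/mman.h>
    #include <stdio.h>

    int main(void)
    {
            size_t len = 2 * 1024 * 1024;
            void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            unsigned long nodes = 1UL << 0;       /* bind to node 0 */

            if (addr == MAP_FAILED)
                    return 1;
            /*
             * An explicit per-region policy like this opts the region out
             * of NUMA balancing.  Per the discussion above, MPOL_MF_LAZY
             * is the region-level way to opt back in (it marks the VMA's
             * policy MPOL_F_MOF); a task-wide set_mempolicy() has no
             * equivalent flag yet.
             */
            if (mbind(addr, len, MPOL_BIND, &nodes, 8 * sizeof(nodes),
                      MPOL_MF_MOVE))
                    perror("mbind");
            /* ... fault in and use the memory ... */
            return 0;
    }
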
> >> > It might still end up being better, but I was not aware of a
> >> > *realistic* workload that binds to multiple nodes
> >> > deliberately. Generally I expect that if an application is binding,
> >> > it's binding to one local node.
> >>
> >> Yes. It's not a popular configuration for now. But on a memory
> >> tiering system with both DRAM and PMEM, the DRAM and the PMEM in one
> >> socket become 2 NUMA nodes. To avoid too much cross-socket memory
> >> access while still taking advantage of both the DRAM and the PMEM,
> >> the workload can be bound to those 2 NUMA nodes (DRAM and PMEM).
> >>
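
To make that concrete, a rough sketch of such a binding (node numbers are
hypothetical: assume node 0 is this socket's DRAM and node 2 its PMEM):

    #include <numaif.h>       /* set_mempolicy(), MPOL_BIND; link with -lnuma */
    #include <stdio.h>

    int main(void)
    {
            /*
             * Hypothetical numbering: node 0 = this socket's DRAM,
             * node 2 = this socket's PMEM.  Binding the task to both
             * keeps its memory on-socket while allowing either tier.
             */
            unsigned long nodes = (1UL << 0) | (1UL << 2);

            if (set_mempolicy(MPOL_BIND, &nodes, 8 * sizeof(nodes)))
                    perror("set_mempolicy");
            /* ... exec or run the workload here ... */
            return 0;
    }

(The same thing can be done from the command line with something like
numactl --membind=0,2 <workload>.)
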
> >
> > Ok, that may lead to unpredictable performance, as there is only limited
> > control over which "important" applications should get DRAM rather than
> > PMEM. That's a long road, but this step is not incompatible with the
> > long-term goal.
>
> Yes. Ben Widawsky is working on a patchset to make it possible to
> prefer the remote DRAM instead of the local PMEM as follows,
>
> https://lore.kernel.org/linux-mm/20200630212517.308045-1-ben.widawsky@intel.com/
>
> Best Regards,
> Huang, Ying
>
A rebased version was posted here:
https://lore.kernel.org/linux-mm/20201030190238.306764-1-ben.widawsky@intel.com/
Thanks.
Ben