Message-ID: <20181127182137.GE6923@dhcp22.suse.cz>
Date:   Tue, 27 Nov 2018 19:21:37 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     rong.a.chen@...el.com, Andrea Arcangeli <aarcange@...hat.com>,
        s.priebe@...fihost.ag, alex.williamson@...hat.com,
        mgorman@...hsingularity.net, zi.yan@...rutgers.edu,
        Vlastimil Babka <vbabka@...e.cz>, rientjes@...gle.com,
        kirill@...temov.name, Andrew Morton <akpm@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        lkp@...org
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3%
 regression

On Tue 27-11-18 19:17:27, Michal Hocko wrote:
> On Tue 27-11-18 09:08:50, Linus Torvalds wrote:
> > On Mon, Nov 26, 2018 at 10:24 PM kernel test robot
> > <rong.a.chen@...el.com> wrote:
> > >
> > > FYI, we noticed a -61.3% regression of vm-scalability.throughput due
> > > to commit ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for
> > > MADV_HUGEPAGE mappings")
> > 
> > Well, that's certainly noticeable and not good.
> > 
> > Andrea, I suspect it might be causing fights with auto numa migration..
> > 
> > Lots more system time, but also look at this:
> > 
> > >    1122389 ±  9%     +17.2%    1315380 ±  4%  proc-vmstat.numa_hit
> > >     214722 ±  5%     +21.6%     261076 ±  3%  proc-vmstat.numa_huge_pte_updates
> > >    1108142 ±  9%     +17.4%    1300857 ±  4%  proc-vmstat.numa_local
> > >     145368 ± 48%     +63.1%     237050 ± 17%  proc-vmstat.numa_miss
> > >     159615 ± 44%     +57.6%     251573 ± 16%  proc-vmstat.numa_other
> > >     185.50 ± 81%   +8278.6%      15542 ± 40%  proc-vmstat.numa_pages_migrated
> > 
> > Should the commit be reverted? Or perhaps at least modified?
> 
> Well, the commit is trying to restore the behavior from before commit
> 5265047ac301, because there are real use cases that suffered from that
> change, and bug reports as a result of it.
> 
> vm-scalability is certainly worth considering, but it is an artificial
> test case. A higher NUMA miss rate is an expected side effect of the
> patch, because falling back to a different NUMA node is now more
> likely. The __GFP_THISNODE side effect was essentially to introduce
> node-reclaim behavior for THP. Another issue is that there is no single
> behavior that is good for everybody: whether to reclaim locally or to
> place THP on a remote node is hard to decide by default. We have
> discussed this at length and reached some conclusions. One of them is
> that we need a NUMA policy to express whether expensive local
> allocation is preferred over remote allocation. We also definitely
> need better proactive defragmentation to make larger pages available
> on the local node. That is a work in progress, and this patch is a
> stopgap fix.

Btw. the associated discussion is http://lkml.kernel.org/r/20180925120326.24392-1-mhocko@kernel.org

-- 
Michal Hocko
SUSE Labs
