lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 12 Dec 2018 10:50:51 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     David Rientjes <rientjes@...gle.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        mgorman@...hsingularity.net, Vlastimil Babka <vbabka@...e.cz>,
        ying.huang@...el.com, s.priebe@...fihost.ag,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        alex.williamson@...hat.com, lkp@...org, kirill@...temov.name,
        Andrew Morton <akpm@...ux-foundation.org>,
        zi.yan@...rutgers.edu
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3%
 regression

On Tue 11-12-18 16:37:22, David Rientjes wrote:
[...]
> Since it depends on the workload, specifically workloads that fit within a 
> single node, I think the reasonable approach would be to have a sane 
> default regardless of the use of MADV_HUGEPAGE or thp defrag settings and 
> then optimzie for the minority of cases where the workload does not fit in 
> a single node.  I'm assuming there is no debate about these larger 
> workloads being in the minority, although we have single machines where 
> this encompasses the totality of their workloads.

Your assumption is wrong I believe. This is the fundamental disagreement
we are discussing here. You are essentially arguing for node_reclaim
(formerly zone_reclaim) behavior for THP pages. All that without any
actual data on wider variety of workloads. As the matter of _fact_ we
know that node_reclaim behavior is not a suitable default. We did
that mistake in the past and we had to revert that default _exactly_
because a wider variety of workloads suffered from over reclaim and
performance issues as a result of constant reclaim

You have also haven't explained why you do care so much about remote THP
while you do not care about remote base bages (the page allocator
falls back to those as soon as the kswapd doesn't keep pace with the
allocation rate; THP or high order pages in general is analoguous with
kcompactd doing a pro-active compaction). Like the base pages we do not
want larger pages to fallback to a remote node too easily. There is no
question about that I believe.

I can be convinced that larger pages really require a different behavior
than base pages but you should better show _real_ numbers on a wider
variety workloads to back your claims. I have only heard hand waving and
a very vague and quite doubtful numbers for a non-disclosed benchmark
without a clear indication on how it relates to real world workloads. So
color me unconvinced.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ