linux-kernel - Re: [RFC] mm: Proactive compaction


Message-ID: <7bbd5322ed7a7fcb349c83952f8fc17448cd07d8.camel@nvidia.com>
Date:   Thu, 19 Sep 2019 23:37:34 +0000
From:   Nitin Gupta <nigupta@...dia.com>
To:     "dan.j.williams@...el.com" <dan.j.williams@...el.com>,
        "mhocko@...e.com" <mhocko@...e.com>,
        "mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
        "vbabka@...e.cz" <vbabka@...e.cz>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
CC:     "cai@....pw" <cai@....pw>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "aryabinin@...tuozzo.com" <aryabinin@...tuozzo.com>,
        "jannh@...gle.com" <jannh@...gle.com>, "guro@...com" <guro@...com>,
        "hannes@...xchg.org" <hannes@...xchg.org>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "yuzhao@...gle.com" <yuzhao@...gle.com>,
        "arunks@...eaurora.org" <arunks@...eaurora.org>,
        "willy@...radead.org" <willy@...radead.org>,
        "khalid.aziz@...cle.com" <khalid.aziz@...cle.com>,
        "janne.huttunen@...ia.com" <janne.huttunen@...ia.com>,
        "khlebnikov@...dex-team.ru" <khlebnikov@...dex-team.ru>
Subject: Re: [RFC] mm: Proactive compaction

On Tue, 2019-08-20 at 10:46 +0200, Vlastimil Babka wrote:
> > This patch is largely based on ideas from Michal Hocko posted here:
> > https://lore.kernel.org/linux-mm/20161230131412.GI13301@dhcp22.suse.cz/
> > 
> > Testing done (on x86):
> >   - Set /sys/kernel/mm/compaction/order-9/extfrag_{low,high} = {25, 30}
> >   respectively.
> >   - Use a test program to fragment memory: the program allocates all
> >   memory and then for each 2M aligned section, frees 3/4 of base pages
> >   using munmap.
> >   - kcompactd0 detects fragmentation for order-9 > extfrag_high and starts
> >   compaction till extfrag < extfrag_low for order-9.
> > 
> > The patch has plenty of rough edges but posting it early to see if I'm
> > going in the right direction and to get some early feedback.
> 
> That's a lot of control knobs - how is an admin supposed to tune them to
> their needs?


Yes, it's difficult for an admin to get so many tunables right unless
targeting a very specific workload.

How about a simpler solution where we expose just one tunable per node:
   /sys/.../node-x/compaction_effort
which accepts a value in [0, 100].

This parallels /proc/sys/vm/swappiness but for compaction. With this
single number, we can estimate per-order [low, high] watermarks for external
fragmentation like this:
 - For now, map this range to three levels [low, medium, high], each of which
corresponds to specific low, high thresholds for extfrag.
 - Apply more relaxed thresholds for higher orders than for lower orders.

With this single tunable we remove the burden of setting per-order explicit
[low, high] thresholds and it should be easier to experiment with.
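
To make the mapping concrete, here is a rough userspace sketch of how one
compaction_effort value could be turned into per-order [low, high] extfrag
thresholds. Everything in it is an assumption for illustration: the names
(effort_to_level, effort_to_thresholds), the split of [0, 100] into three
equal bands, the base threshold numbers, and the "add one point per order"
relaxation are all made up and are not taken from the patch.

```c
#include <assert.h>

/* Assumption: orders 0..10, as with the common MAX_ORDER of 11. */
#define NR_ORDERS 11

/* Hypothetical coarse levels derived from the single tunable. */
enum effort_level { EFFORT_LOW, EFFORT_MEDIUM, EFFORT_HIGH };

/* Per-order external fragmentation watermarks, in percent. */
struct extfrag_thresholds {
	int low;	/* stop compacting once extfrag drops below this */
	int high;	/* start compacting once extfrag rises above this */
};

static int clamp_int(int v, int lo, int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Split [0, 100] into three bands; the cut points are arbitrary. */
static enum effort_level effort_to_level(int effort)
{
	effort = clamp_int(effort, 0, 100);
	if (effort < 34)
		return EFFORT_LOW;
	if (effort < 67)
		return EFFORT_MEDIUM;
	return EFFORT_HIGH;
}

/*
 * Higher effort means lower thresholds, so compaction starts earlier.
 * Higher orders get more relaxed (larger) thresholds than lower orders.
 * All numbers below are placeholders, not tuned values.
 */
static struct extfrag_thresholds effort_to_thresholds(int effort, int order)
{
	static const struct extfrag_thresholds base[] = {
		[EFFORT_LOW]    = { .low = 70, .high = 80 },
		[EFFORT_MEDIUM] = { .low = 45, .high = 55 },
		[EFFORT_HIGH]   = { .low = 20, .high = 30 },
	};
	struct extfrag_thresholds t;

	assert(order >= 0 && order < NR_ORDERS);

	t = base[effort_to_level(effort)];
	t.low = clamp_int(t.low + order, 0, 100);
	t.high = clamp_int(t.high + order, 0, 100);
	return t;
}
```

With these placeholder numbers, compaction_effort = 100 gives order-9
thresholds of [29, 39], while compaction_effort = 0 gives [79, 89] for the
same order, so a low setting effectively leaves order-9 alone unless
fragmentation is severe.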

-Nitin