Message-ID: <7bbd5322ed7a7fcb349c83952f8fc17448cd07d8.camel@nvidia.com>
Date: Thu, 19 Sep 2019 23:37:34 +0000
From: Nitin Gupta <nigupta@...dia.com>
To: "dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"mhocko@...e.com" <mhocko@...e.com>,
"mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
"vbabka@...e.cz" <vbabka@...e.cz>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
CC: "cai@....pw" <cai@....pw>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"aryabinin@...tuozzo.com" <aryabinin@...tuozzo.com>,
"jannh@...gle.com" <jannh@...gle.com>, "guro@...com" <guro@...com>,
"hannes@...xchg.org" <hannes@...xchg.org>,
"keescook@...omium.org" <keescook@...omium.org>,
"yuzhao@...gle.com" <yuzhao@...gle.com>,
"arunks@...eaurora.org" <arunks@...eaurora.org>,
"willy@...radead.org" <willy@...radead.org>,
"khalid.aziz@...cle.com" <khalid.aziz@...cle.com>,
"janne.huttunen@...ia.com" <janne.huttunen@...ia.com>,
"khlebnikov@...dex-team.ru" <khlebnikov@...dex-team.ru>
Subject: Re: [RFC] mm: Proactive compaction
On Tue, 2019-08-20 at 10:46 +0200, Vlastimil Babka wrote:
> > This patch is largely based on ideas from Michal Hocko posted here:
> > https://lore.kernel.org/linux-mm/20161230131412.GI13301@dhcp22.suse.cz/
> >
> > Testing done (on x86):
> > - Set /sys/kernel/mm/compaction/order-9/extfrag_{low,high} = {25, 30}
> > respectively.
> > - Use a test program to fragment memory: the program allocates all memory
> > and then for each 2M aligned section, frees 3/4 of base pages using munmap.
> > - kcompactd0 detects fragmentation for order-9 > extfrag_high and starts
> > compaction till extfrag < extfrag_low for order-9.
> >
> > The patch has plenty of rough edges but posting it early to see if I'm
> > going in the right direction and to get some early feedback.
>
> That's a lot of control knobs - how is an admin supposed to tune them to
> their needs?
Yes, it's difficult for an admin to get so many tunables right unless
targeting a very specific workload.
How about a simpler solution where we expose just one tunable per node:
/sys/.../node-x/compaction_effort
which accepts a value in the range [0, 100].
This parallels /proc/sys/vm/swappiness but for compaction. With this
single number, we can estimate per-order [low, high] watermarks for external
fragmentation like this:
- For now, map this range to [low, medium, high], each of which corresponds to
specific low and high thresholds for extfrag.
- Apply more relaxed thresholds for higher orders than for lower orders.
With this single tunable, we remove the burden of setting explicit per-order
[low, high] thresholds, and it should be easier to experiment with.
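
A minimal userspace sketch of how such a mapping might look, written in Python
only for brevity. Everything numeric here is an assumption for illustration:
the band edges (33, 66), the base [low, high] values per band, the per-order
relaxation step, and all function names are hypothetical and not taken from
the patch. The start/stop rule follows the behaviour described in the testing
notes (start above extfrag_high, stop below extfrag_low).

```python
# Illustrative model only; none of these constants come from the patch.

# Assumed base (extfrag_low, extfrag_high) thresholds at order 0, per band.
# Higher effort means lower tolerated fragmentation.
BASE_THRESHOLDS = {
    "low": (60, 70),
    "medium": (40, 50),
    "high": (20, 30),
}

# Assumed amount by which thresholds are relaxed (raised) per order.
ORDER_RELAX_STEP = 2


def effort_band(effort):
    """Map compaction_effort in [0, 100] to one of three coarse bands."""
    if not 0 <= effort <= 100:
        raise ValueError("compaction_effort must be in [0, 100]")
    if effort <= 33:
        return "low"
    if effort <= 66:
        return "medium"
    return "high"


def extfrag_thresholds(effort, order):
    """Return (low, high) extfrag thresholds for the given effort and order.

    Higher orders get more relaxed (larger) thresholds, capped at 100.
    """
    low, high = BASE_THRESHOLDS[effort_band(effort)]
    relax = ORDER_RELAX_STEP * order
    return min(low + relax, 100), min(high + relax, 100)


def should_compact(extfrag, low, high, compacting):
    """Hysteresis: start when extfrag > high, keep going until extfrag < low."""
    if compacting:
        return extfrag >= low
    return extfrag > high
```

With these assumed values, compaction_effort = 100 would give order-9
thresholds of [38, 48], while compaction_effort = 0 would give [78, 88].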
-Nitin