Message-ID: <BYAPR12MB3015C0746B50A129FCDD9E47D8490@BYAPR12MB3015.namprd12.prod.outlook.com>
Date:   Fri, 22 Nov 2019 22:31:12 +0000
From:   Nitin Gupta <nigupta@...dia.com>
To:     David Rientjes <rientjes@...gle.com>
CC:     "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "vbabka@...e.cz" <vbabka@...e.cz>,
        "mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
        "mhocko@...e.com" <mhocko@...e.com>,
        "dan.j.williams@...el.com" <dan.j.williams@...el.com>,
        Yu Zhao <yuzhao@...gle.com>,
        Matthew Wilcox <willy@...radead.org>, Qian Cai <cai@....pw>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Roman Gushchin <guro@...com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Kees Cook <keescook@...omium.org>,
        Jann Horn <jannh@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Arun KS <arunks@...eaurora.org>,
        Janne Huttunen <janne.huttunen@...ia.com>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: RE: [RFC] mm: Proactive compaction



> -----Original Message-----
> From: owner-linux-mm@...ck.org <owner-linux-mm@...ck.org> On Behalf
> Of David Rientjes
> Sent: Monday, September 16, 2019 1:17 PM
> To: Nitin Gupta <nigupta@...dia.com>
> Cc: akpm@...ux-foundation.org; vbabka@...e.cz;
> mgorman@...hsingularity.net; mhocko@...e.com;
> dan.j.williams@...el.com; Yu Zhao <yuzhao@...gle.com>; Matthew Wilcox
> <willy@...radead.org>; Qian Cai <cai@....pw>; Andrey Ryabinin
> <aryabinin@...tuozzo.com>; Roman Gushchin <guro@...com>; Greg Kroah-
> Hartman <gregkh@...uxfoundation.org>; Kees Cook
> <keescook@...omium.org>; Jann Horn <jannh@...gle.com>; Johannes
> Weiner <hannes@...xchg.org>; Arun KS <arunks@...eaurora.org>; Janne
> Huttunen <janne.huttunen@...ia.com>; Konstantin Khlebnikov
> <khlebnikov@...dex-team.ru>; linux-kernel@...r.kernel.org; linux-
> mm@...ck.org
> Subject: Re: [RFC] mm: Proactive compaction
> 
> On Fri, 16 Aug 2019, Nitin Gupta wrote:
> 
> > For some applications we need to allocate almost all memory as
> > hugepages. However, on a running system, higher-order allocations can
> > fail if the memory is fragmented. The Linux kernel currently does
> > on-demand compaction as we request more hugepages, but this style of
> > compaction incurs very high latency. Experiments with one-time full
> > memory compaction (followed by hugepage allocations) show that the
> > kernel is able to restore a highly fragmented memory state to a fairly
> > compacted state within <1 sec on a 32G system. Such data suggests
> > that more proactive compaction can help us allocate a large fraction
> > of memory as hugepages while keeping allocation latencies low.
> >
> > For more proactive compaction, the approach taken here is to define
> > per page-order external fragmentation thresholds and let kcompactd
> > threads act on these thresholds.
> >
> > The low and high thresholds are defined per page-order and exposed
> > through sysfs:
> >
> >   /sys/kernel/mm/compaction/order-[1..MAX_ORDER]/extfrag_{low,high}
> >
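For illustration, here is a minimal userspace sketch that sets the order-9
thresholds; the sysfs files are the ones proposed in this RFC (not a merged
kernel interface) and the values are only examples:

    /*
     * Sketch: set the proposed order-9 extfrag thresholds from userspace.
     * These sysfs files are the ones proposed in this RFC, not an existing
     * kernel interface, and the values below are only examples.
     */
    #include <stdio.h>

    static int write_uint(const char *path, unsigned int val)
    {
        FILE *f = fopen(path, "w");

        if (!f) {
            perror(path);
            return -1;
        }
        fprintf(f, "%u\n", val);
        return fclose(f);
    }

    int main(void)
    {
        write_uint("/sys/kernel/mm/compaction/order-9/extfrag_low", 25);
        write_uint("/sys/kernel/mm/compaction/order-9/extfrag_high", 30);
        return 0;
    }
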
> > The per-node kcompactd thread is woken up every few seconds to check
> > whether any zone on its node has extfrag above the extfrag_high
> > threshold for any order, in which case the thread starts compaction
> > in the background until all zones are below the extfrag_low level for
> > all orders. By default both thresholds are set to 100 for all orders,
> > which essentially disables kcompactd.
> >
> > To avoid wasting CPU cycles when compaction cannot help, such as when
> > memory is full, we check both extfrag > extfrag_high and
> > compaction_suitable(zone). This allows the kcompactd thread to stay
> > inactive even when the extfrag_high threshold is exceeded.
> >
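As a rough, userspace-only sketch of the metric and hysteresis described
above: per-order external fragmentation can be derived from per-order free
block counts (as in /proc/buddyinfo), and compaction would start above
extfrag_high and stop below extfrag_low. The exact extfrag formula and the
names used here are illustrative assumptions and may differ from the patch:

    /*
     * Userspace sketch of the per-order "extfrag" metric and the high/low
     * hysteresis described above. The formula and names are illustrative;
     * the actual patch may define extfrag differently.
     */
    #include <stdio.h>

    #define MAX_ORDER 11    /* orders 0..10, as on x86 */

    /* Percent of free memory NOT in blocks of at least 'order'. */
    static int extfrag(const unsigned long free_blocks[MAX_ORDER], int order)
    {
        unsigned long free_pages = 0, suitable = 0;
        int o;

        for (o = 0; o < MAX_ORDER; o++) {
            free_pages += free_blocks[o] << o;
            if (o >= order)
                suitable += free_blocks[o] << o;
        }
        return free_pages ?
               (int)(100 * (free_pages - suitable) / free_pages) : 0;
    }

    int main(void)
    {
        /* Hypothetical per-order free counts, e.g. from /proc/buddyinfo. */
        unsigned long free_blocks[MAX_ORDER] = {
            50000, 20000, 8000, 3000, 1000, 300, 80, 20, 5, 2, 0
        };
        int extfrag_low = 25, extfrag_high = 30; /* values from the test below */
        int frag = extfrag(free_blocks, 9);

        if (frag > extfrag_high)
            printf("order-9 extfrag %d%% > %d%%: compact until below %d%%\n",
                   frag, extfrag_high, extfrag_low);
        else
            printf("order-9 extfrag %d%%: stay idle\n", frag);
        return 0;
    }
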
> > This patch is largely based on ideas from Michal Hocko posted here:
> > https://lore.kernel.org/linux-mm/20161230131412.GI13301@...p22.suse.cz/
> >
> > Testing done (on x86):
> >  - Set /sys/kernel/mm/compaction/order-9/extfrag_{low,high} = {25, 30}
> >    respectively.
> >  - Use a test program to fragment memory: the program allocates all
> >    memory and then, for each 2M-aligned section, frees 3/4 of the base
> >    pages using munmap (see the sketch below).
> >  - kcompactd0 detects fragmentation for order-9 > extfrag_high and
> >    starts compaction until extfrag < extfrag_low for order-9.
> >
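A minimal sketch of such a fragmenter follows; the size, mmap flags, and
alignment handling are simplified assumptions, not the actual test program:

    /*
     * Sketch of a fragmenter along the lines described above: map a large
     * anonymous region, fault it in, then in every 2M section unmap 3 of
     * every 4 base pages. The 4G size stands in for "all memory", and the
     * mapping itself is not forced to be 2M-aligned.
     */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t total = 4UL << 30;                    /* stand-in for "all memory" */
        size_t huge = 2UL << 20;                     /* 2M section */
        size_t page = (size_t)sysconf(_SC_PAGESIZE); /* typically 4K */
        char *base, *sec;

        base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
        if (base == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* Keep base page 0 of each 4-page group, unmap pages 1..3. */
        for (sec = base; sec + huge <= base + total; sec += huge)
            for (size_t off = 0; off < huge; off += 4 * page)
                munmap(sec + off + page, 3 * page);

        printf("fragmented %zu MiB; watch /proc/buddyinfo\n", total >> 20);
        pause();    /* keep the remaining mappings alive */
        return 0;
    }
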
> > The patch has plenty of rough edges, but I'm posting it early to see
> > if I'm going in the right direction and to get some early feedback.
> >
> 
> Is there an update to this proposal, or a non-RFC patch that has been
> posted for proactive compaction?
> 

I recently posted a non-RFC patch for proactive compaction:

https://lkml.org/lkml/2019/11/15/1099

Please let me know if you try it out or if you have any feedback.

Thanks,
Nitin



> We've had good success with periodically compacting memory on a regular
> cadence on systems with hugepages enabled.  The cadence itself is defined
> by the admin, and it causes khugepaged[*] to periodically wake up and
> invoke compaction in an attempt to keep zones as defragmented as possible
> (perhaps more "proactive" than what is proposed here, in that it tries to
> keep all memory as unfragmented as possible regardless of extfrag
> thresholds). It also avoids corner cases where kcompactd could become
> more expensive than anticipated because it is unsuccessful at compacting
> memory yet the extfrag threshold is still exceeded.
> 
>  [*] Khugepaged rather than kcompactd, only because this is enabled only
>      on systems where transparent hugepages are enabled; it is probably
>      better off in kcompactd to avoid duplicating work between two
>      kthreads if there is already a need for background compaction.
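
For reference, a purely userspace approximation of this cadence-based idea
can be built on the existing /proc/sys/vm/compact_memory trigger; the setup
described above lives kernel-side in khugepaged, so this is only a sketch
of the idea, with an arbitrary example cadence:

    /*
     * Userspace approximation of "compact on a regular cadence". This
     * sketch pokes the existing /proc/sys/vm/compact_memory trigger every
     * N seconds, which requests full compaction of all zones regardless
     * of the current fragmentation level.
     */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const unsigned int cadence_secs = 600;   /* admin-chosen cadence */

        for (;;) {
            FILE *f = fopen("/proc/sys/vm/compact_memory", "w");

            if (f) {
                fputs("1\n", f);
                fclose(f);
            }
            sleep(cadence_secs);
        }
    }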
