Date: Fri, 26 Jan 2024 17:03:17 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Zi Yan <ziy@...dia.com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: "Huang, Ying" <ying.huang@...el.com>, Ryan Roberts
 <ryan.roberts@....com>, Andrew Morton <akpm@...ux-foundation.org>,
 "Matthew Wilcox (Oracle)" <willy@...radead.org>,
 David Hildenbrand <david@...hat.com>, "Yin, Fengwei"
 <fengwei.yin@...el.com>, Yu Zhao <yuzhao@...gle.com>,
 Vlastimil Babka <vbabka@...e.cz>,
 "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
 Johannes Weiner <hannes@...xchg.org>, Kemeng Shi
 <shikemeng@...weicloud.com>, Mel Gorman <mgorman@...hsingularity.net>,
 Rohan Puri <rohan.puri15@...il.com>, Mcgrof Chamberlain <mcgrof@...nel.org>,
 Adam Manzanares <a.manzanares@...sung.com>,
 "Vishal Moola (Oracle)" <vishal.moola@...il.com>
Subject: Re: [PATCH v2 0/3] Enable >0 order folio memory compaction



On 1/23/2024 11:46 AM, Zi Yan wrote:
> From: Zi Yan <ziy@...dia.com>
> 
> Hi all,
> 
> This patchset enables >0 order folio memory compaction, which is one of
> the prerequisites for large folio support[1]. It is on top of
> mm-everything-2024-01-18-22-21.
> 
> I am aware that splitting free pages is necessary for folio
> migration in compaction: if >0 order free pages are never split
> and no order-0 free page is scanned, compaction will end prematurely
> because migration returns -ENOMEM. Free page split becomes a must instead of
> an optimization.
> 
> Some applications from vm-scalability show different performance trends
> on default LRU and CONFIG_LRU_GEN from patch 1 (split folio during compaction),
> to patch 2 (folio migration during compaction), to patch 3 (folio
> migration during compaction with free page split). I am looking into it.
> 
> lkp ncompare results (with >5% delta) for default LRU and CONFIG_LRU_GEN are
> shown at the bottom (on an 8-CPU (Intel Xeon E5-2650 v4 @ 2.20GHz) 16G VM).

Overall, I haven't found any obvious issues; thanks for your work.
However, I saw a regression in the "Percentage huge" results when running
thpcompact on my machine (16 cores and 120G memory) without enabling mTHP:
                                  k6.8-rc1               k6.8-rc1-patched
Percentage huge-1        86.19 (   0.00%)       51.17 ( -40.63%)
Percentage huge-3        93.64 (   0.00%)       42.48 ( -54.64%)
Percentage huge-5        94.93 (   0.00%)       31.06 ( -67.28%)
Percentage huge-7        95.40 (   0.00%)       19.09 ( -79.99%)
Percentage huge-12       93.51 (   0.00%)       32.06 ( -65.71%)
Percentage huge-18       83.02 (   0.00%)       54.58 ( -34.26%)
Percentage huge-24       83.17 (   0.00%)       49.61 ( -40.35%)
Percentage huge-30       96.69 (   0.00%)       59.82 ( -38.13%)
Percentage huge-32       95.52 (   0.00%)       59.20 ( -38.03%)

Ops Compaction stalls                 229710.00      554846.00
Ops Compaction success                144177.00        9351.00
Ops Compaction failures                85533.00      545495.00
Ops Compaction efficiency                 62.76           1.69
Ops Page migrate success            60333689.00    11687573.00
Ops Page migrate failure               25818.00      459621.00
Ops Compaction pages isolated      127723211.00   224420997.00
Ops Compaction migrate scanned     142498744.00   173345194.00
Ops Compaction free scanned       1159752360.00   624633726.00
Ops Compact scan efficiency               12.29          27.75
Ops Compaction cost                    66050.96       17615.55

I have not had time to analyze this issue yet; I am just providing the
test data for now. I will also measure the compaction efficiency for mTHP
when I find some time.
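
For reference, the derived rows in the table above can be reproduced from the raw counters. The following is a minimal Python sketch, assuming that "Compaction efficiency" is success/stalls and "Compact scan efficiency" is migrate scanned/free scanned, both in percent; these definitions are inferred from the numbers, and the helper names percent_delta() and ratio_percent() are hypothetical.

```python
def percent_delta(baseline, patched):
    """Relative change of patched vs. baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


def ratio_percent(part, whole):
    """part / whole expressed in percent."""
    return part / whole * 100.0


# Raw counters copied from the table: (k6.8-rc1, k6.8-rc1-patched).
stalls = (229710, 554846)
success = (144177, 9351)
migrate_scanned = (142498744, 173345194)
free_scanned = (1159752360, 624633726)

for i, kernel in enumerate(("k6.8-rc1", "k6.8-rc1-patched")):
    # Assumed definition: successful compactions per compaction stall.
    eff = ratio_percent(success[i], stalls[i])
    # Assumed definition: migrate-scanner pages per free-scanner page.
    scan_eff = ratio_percent(migrate_scanned[i], free_scanned[i])
    print(f"{kernel}: compaction efficiency {eff:.2f}, "
          f"scan efficiency {scan_eff:.2f}")

# The bracketed column in the "Percentage huge" rows is a plain relative delta.
print(f"huge-1 delta: {percent_delta(86.19, 51.17):.2f}%")
```

Under those assumptions the ratios agree with the table to two decimals, so the drop in compaction efficiency follows directly from far fewer successes per stall on the patched kernel.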

>  From V1 [2]:
> 1. Used folio_test_large() instead of folio_order() > 0. (per Matthew
> Wilcox)
> 
> 2. Fixed code rebase error. (per Baolin Wang)
> 
> 3. Used list_split_init() instead of list_split(). (per Ryan Roberts)
> 
> 4. Added free_pages_prepare_fpi_none() to avoid duplicate free page code
> in compaction_free().
> 
> 5. Dropped source page order sorting patch.
> 
>  From RFC [1]:
> 1. Enabled >0 order folio compaction in the first patch by splitting all
> to-be-migrated folios. (per Huang, Ying)
> 
> 2. Stopped isolating compound pages with order greater than cc->order
> to avoid wasting effort, since cc->order gives a hint that no free pages
> with order greater than it exist, thus migrating the compound pages will fail.
> (per Baolin Wang)
> 
> 3. Retained the folio check within lru lock. (per Baolin Wang)
> 
> 4. Made isolate_freepages_block() generate order-sorted multi lists.
> (per Johannes Weiner)
> 
> Overview
> ===
> 
> To support >0 order folio compaction, the patchset changes how free pages used
> for migration are kept during compaction. Free pages used to be split into
> order-0 pages that were post-allocation processed (i.e., PageBuddy flag cleared,
> page order stored in page->private zeroed, and page reference set to 1).
> Now all free pages are kept in a MAX_ORDER+1 array of page lists based
> on their order, without post-allocation processing. When migrate_pages() asks for
> a new page, one of the free pages, based on the requested page order, is
> then processed and given out.
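
The per-order bookkeeping described in the overview can be illustrated with a toy model. This is only a sketch in Python, not the kernel code: the FreeLists class, its add() and take() methods, and the MAX_ORDER value of 10 are assumptions made for illustration. It keeps one list of free blocks per order and, when the requested order has no free block, splits a larger one, which corresponds to the free page split that the cover letter calls a must.

```python
MAX_ORDER = 10  # assumed value, for illustration only


class FreeLists:
    """Toy model: one list of free block start PFNs per order."""

    def __init__(self, max_order=MAX_ORDER):
        self.max_order = max_order
        # MAX_ORDER+1 lists, indexed by block order.
        self.lists = [[] for _ in range(max_order + 1)]

    def add(self, pfn, order):
        """Record a free block of 2**order pages starting at pfn."""
        self.lists[order].append(pfn)

    def take(self, order):
        """Hand out a block of the requested order.

        Falls back to splitting a larger free block; returns None when
        nothing suitable is left (the -ENOMEM case in the real code).
        """
        for cur in range(order, self.max_order + 1):
            if not self.lists[cur]:
                continue
            pfn = self.lists[cur].pop()
            # Halve the block until it matches the request, putting the
            # unused upper half back on the list of the lower order.
            while cur > order:
                cur -= 1
                self.lists[cur].append(pfn + (1 << cur))
            return pfn
        return None


free = FreeLists()
free.add(pfn=0, order=3)    # a single free order-3 block (8 pages)
first = free.take(order=0)  # splits it: order-0, order-1 and order-2 remain
second = free.take(order=2)
third = free.take(order=3)  # nothing large enough is left
```

Without the fallback loop, take(order=0) would fail in this example even though eight free pages are available, which mirrors the premature end of compaction described for the case where free pages are never split.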
> 
> 
> Feel free to give comments and ask questions.
> 
> Thanks.
> 
> [1] https://lore.kernel.org/linux-mm/20230912162815.440749-1-zi.yan@sent.com/
> [2] https://lore.kernel.org/linux-mm/20231113170157.280181-1-zi.yan@sent.com/
> 
> vm-scalability results on CONFIG_LRU_GEN
> ===
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/small-allocs/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     2024326           +35.5%    2743772 ± 41%    +364.0%    9392198 ± 35%     +31.0%    2651634        vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/small-allocs-mt/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     1450189            +0.9%    1463418           +30.4%    1891610 ± 22%      +0.3%    1454100        vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>    14428848 ± 27%     -51.7%    6963308 ± 73%     +13.5%   16372621           +11.2%   16046511        vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>    13569502 ± 24%     -45.9%    7340064 ± 59%     +12.3%   15240531           +10.4%   14983705        vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq-mt/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>    13305823 ± 24%     -45.1%    7299664 ± 56%     +12.5%   14974725           +10.4%   14695963        vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>    13244376 ± 28%     +54.2%   20425838 ± 23%      -4.4%   12660113 ±  3%      -9.0%   12045809 ±  3%  vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     7021425 ± 11%     -20.9%    5556751 ± 19%     +14.8%    8057811 ±  3%      +9.4%    7678613 ±  4%  vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/256G/qemu-vm/msync/vm-scalability
> 
> commit:
>    6.7.0-rc4+
>    6.7.0-rc4-split-folio-in-compaction+
>    6.7.0-rc4-folio-migration-in-compaction+
>    6.7.0-rc4-folio-migration-free-page-split+
> 
>        6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     1208994 ±137%    +263.5%    4394683 ± 49%     -49.4%     611204 ±  6%     -48.1%     627937 ± 13%  vm-scalability.throughput
> 
> 
> 
> vm-scalability results on default LRU (with -no-mglru suffix)
> ===
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability
> 
> commit:
>    6.7.0-rc4-no-mglru+
>    6.7.0-rc4-split-folio-in-compaction-no-mglru+
>    6.7.0-rc4-folio-migration-in-compaction-no-mglru+
>    6.7.0-rc4-folio-migration-free-page-split-no-mglru+
> 
> 6.7.0-rc4-no-mgl 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     8412072 ±  3%     +32.1%   11114537 ± 41%      +3.5%    8703491 ±  3%      +1.5%    8536343 ±  3%  vm-scalability.throughput
> 
> =========================================================================================
> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
>    gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability
> 
> commit:
>    6.7.0-rc4-no-mglru+
>    6.7.0-rc4-split-folio-in-compaction-no-mglru+
>    6.7.0-rc4-folio-migration-in-compaction-no-mglru+
>    6.7.0-rc4-folio-migration-free-page-split-no-mglru+
> 
> 6.7.0-rc4-no-mgl 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f
> ---------------- --------------------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \          |                \
>     7095358           +10.8%    7863635 ± 16%      +5.5%    7484110            +1.5%    7200666 ±  4%  vm-scalability.throughput
> 
> 
> Zi Yan (3):
>    mm/compaction: enable compacting >0 order folios.
>    mm/compaction: add support for >0 order folio memory compaction.
>    mm/compaction: optimize >0 order folio compaction with free page
>      split.
> 
>   mm/compaction.c | 218 ++++++++++++++++++++++++++++++++++--------------
>   mm/internal.h   |   9 +-
>   mm/page_alloc.c |   6 ++
>   3 files changed, 169 insertions(+), 64 deletions(-)
> 
