lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <cover.1661461643.git.alexlzhu@fb.com>
Date:   Thu, 25 Aug 2022 14:30:51 -0700
From:   <alexlzhu@...com>
To:     <linux-mm@...ck.org>
CC:     <willy@...radead.org>, <hannes@...xchg.org>,
        <akpm@...ux-foundation.org>, <riel@...riel.com>,
        <kernel-team@...com>, <linux-kernel@...r.kernel.org>,
        Alexander Zhu <alexlzhu@...com>
Subject: [RFC 0/3] THP Shrinker

From: Alexander Zhu <alexlzhu@...com>

Transparent Hugepages use a larger page size of 2MB in comparison to
normal sized pages that are 4kb. A larger page size allows for fewer TLB
cache misses and thus more efficient use of the CPU. Using a larger page
size also results in more memory waste, which can hurt performance in some
use cases. THPs are currently enabled in the Linux Kernel by applications
in limited virtual address ranges via the madvise system call.  The THP
shrinker tries to find a balance between increased use of THPs, and
increased use of memory. It shrinks the size of memory by removing the
underutilized THPs that are identified by the thp_utilization scanner. 

In our experiments we have noticed that the least utilized THPs are almost
entirely unutilized.

Sample Output: 

Utilized[0-50]: 1331 680884
Utilized[51-101]: 9 3983
Utilized[102-152]: 3 1187
Utilized[153-203]: 0 0
Utilized[204-255]: 2 539
Utilized[256-306]: 5 1135
Utilized[307-357]: 1 192
Utilized[358-408]: 0 0
Utilized[409-459]: 1 57
Utilized[460-512]: 400 13
Last Scan Time: 223.98
Last Scan Duration: 70.65

Above is a sample obtained from one of our test machines when THP is always
enabled. Of the 1331 THPs in this thp_utilization sample that have from
0-50 utilized subpages, we see that there are 680884 free pages. This
comes out to 680884 / (512 * 1331) = 99.91% zero pages in the least
utilized bucket. This represents 680884 * 4KB = 2.7GB memory waste.

Also note that the vast majority of pages are either in the least utilized
[0-50] or most utilized [460-512] buckets. The least utilized THPs are 
responsible for almost all of the memory waste when THP is always 
enabled. Thus by clearing out THPs in the lowest utilization bucket
we extract most of the improvement in CPU efficiency. We have seen 
similar results on our production hosts.

This patchset introduces the THP shrinker we have developed to identify
and split the least utilized THPs. It includes the thp_utilization 
changes that groups anonymous THPs into buckets, the split_huge_page()
changes that identify and zap zero 4KB pages within THPs and the shrinker
changes. It should be noted that the split_huge_page() changes are based
off previous work done by Yu Zhao. 

In the future, we intend to allow additional tuning to the shrinker
based on workload depending on CPU/IO/Memory pressure and the 
amount of anonymous memory. The long term goal is to eventually always 
enable THP for all applications and deprecate madvise entirely.

Alexander Zhu (3):
  mm: add thp_utilization metrics to debugfs
  mm: changes to split_huge_page() to free zero filled tail pages
  mm: THP low utilization shrinker

 Documentation/admin-guide/mm/transhuge.rst    |   9 +
 include/linux/huge_mm.h                       |   9 +
 include/linux/list_lru.h                      |  24 ++
 include/linux/mm_types.h                      |   5 +
 include/linux/rmap.h                          |   2 +-
 include/linux/vm_event_item.h                 |   2 +
 mm/huge_memory.c                              | 333 +++++++++++++++++-
 mm/list_lru.c                                 |  49 +++
 mm/migrate.c                                  |  60 +++-
 mm/migrate_device.c                           |   4 +-
 mm/page_alloc.c                               |   6 +
 mm/vmstat.c                                   |   2 +
 .../selftests/vm/split_huge_page_test.c       |  58 ++-
 tools/testing/selftests/vm/vm_util.c          |  23 ++
 tools/testing/selftests/vm/vm_util.h          |   1 +
 15 files changed, 569 insertions(+), 18 deletions(-)

-- 
2.30.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ