Message-ID: <20240717071257.4141363-1-ryan.roberts@arm.com>
Date: Wed, 17 Jul 2024 08:12:52 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Andrew Morton <akpm@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>,
	Jonathan Corbet <corbet@....net>,
	"Matthew Wilcox (Oracle)" <willy@...radead.org>,
	David Hildenbrand <david@...hat.com>,
	Barry Song <baohua@...nel.org>,
	Lance Yang <ioworker0@...il.com>,
	Baolin Wang <baolin.wang@...ux.alibaba.com>,
	Gavin Shan <gshan@...hat.com>,
	Pankaj Raghav <kernel@...kajraghav.com>,
	Daniel Gomez <da.gomez@...sung.com>
Cc: Ryan Roberts <ryan.roberts@....com>,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: [RFC PATCH v1 0/4] Control folio sizes used for page cache memory

Hi All,

This series is an RFC that adds sysfs and kernel cmdline controls to configure
the set of allowed large folio sizes that can be used when allocating
file-memory for the page cache. As part of the control mechanism, it provides
for a special-case "preferred folio size for executable mappings" marker.

I'm trying to solve 2 separate problems with this series:

1. Reduce pressure in iTLB and improve performance on arm64: This is a modified
approach for the change at [1]. Instead of hardcoding the preferred executable
folio size into the arch, user space can now select it. This decouples the arch
code and also makes the mechanism more generic; it can be bypassed (the default)
or any folio size can be set. For my use case, 64K is preferred, but I've also
heard from Willy of a use case where putting all text into 2M PMD-sized folios
is preferred. This approach avoids the need for synchronous MADV_COLLAPSE (and
therefore faulting in all text ahead of time) to achieve that.

2. Reduce memory fragmentation in systems under high memory pressure (e.g.
Android): The theory goes that if all folios are 64K, then failure to allocate a
64K folio should become unlikely. But if the page cache is allocating lots of
different orders, with most allocations being smaller than 64K (as is the
case today), then the ability to allocate 64K folios diminishes. By providing
control over the allowed set of folio sizes, we can tune to avoid crucial 64K
folio allocation failure. Additionally, I've heard (second-hand) of the need to disable
large folios in the page cache entirely due to latency concerns in some
settings. These controls allow all of this without kernel changes.
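
For readers less familiar with how the folio sizes mentioned above (64K, 2M)
relate to allocation orders, here is a minimal sketch of the arithmetic. It is
illustrative only and not taken from the patches; the helper name folio_order()
and the 4K default base page size are assumptions made for the example.

```python
# Illustrative only: relate a folio size to its allocation order.
# Assumption: order n means 2**n contiguous base pages.

PAGE_SIZE_4K = 4 * 1024  # assumed default base page size for this sketch


def folio_order(size_bytes, page_size=PAGE_SIZE_4K):
    """Return the order of a folio of size_bytes for the given base page size.

    Raises ValueError if the size is not a power-of-two multiple of the
    base page size, since such a size cannot be expressed as an order.
    """
    if size_bytes <= 0 or size_bytes % page_size != 0:
        raise ValueError("size must be a positive multiple of the page size")
    pages = size_bytes // page_size
    if pages & (pages - 1) != 0:
        raise ValueError("size must be a power-of-two number of pages")
    return pages.bit_length() - 1


if __name__ == "__main__":
    KB = 1024
    MB = 1024 * KB
    for label, size in (("64K", 64 * KB), ("2M", 2 * MB)):
        print(f"{label} folio with 4K pages: order {folio_order(size)}")
```

With 4K base pages, a 64K folio is order 4 (16 pages) and a 2M folio is order 9
(512 pages). The same 64K folio is order 2 with 16K base pages, so any list of
"allowed sizes" maps to a different set of orders depending on the base page
size.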

The value of (1) is clear and the performance improvements are documented in
patch 2. I don't yet have any data demonstrating the theory for (2) since I
can't reproduce the setup that Barry had at [2]. But my view is that by adding
these controls we will enable the community to explore further, in the same way
that the anon mTHP controls helped harden the understanding for anonymous
memory.

---
This series depends on the "mTHP allocation stats for file-backed memory" series
at [3], which itself applies on top of yesterday's mm-unstable (650b6752c8a3). All
mm selftests have been run; no regressions were observed.

[1] https://lore.kernel.org/linux-mm/20240215154059.2863126-1-ryan.roberts@arm.com/
[2] https://www.youtube.com/watch?v=ht7eGWqwmNs&list=PLbzoR-pLrL6oj1rVTXLnV7cOuetvjKn9q&index=4
[3] https://lore.kernel.org/linux-mm/20240716135907.4047689-1-ryan.roberts@arm.com/

Thanks,
Ryan

Ryan Roberts (4):
  mm: mTHP user controls to configure pagecache large folio sizes
  mm: Introduce "always+exec" for mTHP file_enabled control
  mm: Override mTHP "enabled" defaults at kernel cmdline
  mm: Override mTHP "file_enabled" defaults at kernel cmdline

 .../admin-guide/kernel-parameters.txt         |  16 ++
 Documentation/admin-guide/mm/transhuge.rst    |  66 +++++++-
 include/linux/huge_mm.h                       |  61 ++++---
 mm/filemap.c                                  |  26 ++-
 mm/huge_memory.c                              | 158 +++++++++++++++++-
 mm/readahead.c                                |  43 ++++-
 6 files changed, 329 insertions(+), 41 deletions(-)

--
2.43.0

