Message-ID: <CAHTA-uav+jc9WKr+Ye0zoR+niczZLbTKdX1LisR3YSLtoLJ5Dw@mail.gmail.com>
Date: Thu, 24 Apr 2025 13:10:49 -0500
From: Mitchell Augustin <mitchell.augustin@...onical.com>
To: akpm@...ux-foundation.org, 
	20250211152341.3431089327c5e0ec6ba6064d@...ux-foundation.org
Cc: 21cnbao@...il.com, aneesh.kumar@...nel.org, anshuman.khandual@....com, 
	apopple@...dia.com, baohua@...nel.org, catalin.marinas@....com, cl@...two.org, 
	dave.hansen@...ux.intel.com, david@...hat.com, dev.jain@....com, 
	haowenchao22@...il.com, hughd@...gle.com, ioworker0@...il.com, jack@...e.cz, 
	jglisse@...gle.com, John Hubbard <jhubbard@...dia.com>, kirill.shutemov@...ux.intel.com, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhocko@...e.com, 
	npache@...hat.com, Peter Xu <peterx@...hat.com>, ryan.roberts@....com, 
	srivatsa@...il.mit.edu, surenb@...gle.com, vbabka@...e.cz, 
	vishal.moola@...il.com, wangkefeng.wang@...wei.com, will@...nel.org, 
	willy@...radead.org, yang@...amperecomputing.com, zhengqi.arch@...edance.com, 
	Zi Yan <ziy@...dia.com>, zokeefe@...gle.com, 
	Jacob Martin <jacob.martin@...onical.com>, 
	Vanda Hendrychová <vanda.hendrychova@...onical.com>
Subject: Re: [PATCH v2 00/17] khugepaged: Asynchronous mTHP collapse

Hello,

I realize this is an older version of the series, but Vanda
Hendrychová and I began benchmarking this version before the most
recent revision was posted, and we wanted to share our results as
feedback for this discussion.

For context, my team and I previously found that some of the
benchmarks in this Phoronix benchmark suite [0] perform worse with
thp=madvise than with thp=always, so I suspected that the thp=defer
and khugepaged collapse functionality described in this article [6]
might yield performance between that of madvise and always for the
following benchmarks from that suite:
- GraphicsMagick (all tests), which were substantially improved when
switching from thp=madvise to thp=always
- 7-Zip Compression rating, which was substantially improved when
switching from thp=madvise to thp=always
- Compilation time tests, which were slightly improved when switching
from thp=madvise to thp=always

There were more benchmarks in this suite, but these three were the
ones we had previously identified as being significantly impacted by
the thp setting, and thus are the primary focus of our results.

To analyze this, we ran the benchmarks outlined in this article on the
upstream 6.14 kernel with the following configurations:
- linux v6.14 thp=defer-v1: Transparent Huge Pages: defer
- linux v6.14 thp=defer-v2: Transparent Huge Pages: defer
- linux v6.14 thp=always: Transparent Huge Pages: always
- linux v6.14 thp=never: Transparent Huge Pages: never
- linux v6.14 thp=madvise: Transparent Huge Pages: madvise

"defer-v1" refers to the thp collapse implementation by Nico Pache
[3], and "defer-v2" refers to the implementation in this thread [4].
Both use defer as implemented by series [5].
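
For anyone reproducing these configurations, a minimal sketch of
confirming which mode is active before a run is shown here. It reads
the upstream sysfs file
/sys/kernel/mm/transparent_hugepage/enabled, where the active mode is
the bracketed token. The helper names are hypothetical, and it is an
assumption (not stated in this message) that a kernel carrying the
defer series [5] lists "defer" as an additional token in that file.

```python
from pathlib import Path

# Upstream sysfs knob for the global THP policy.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode from a line such as 'always [madvise] never'.

    The kernel marks the active mode with square brackets. The parser does
    not care which tokens are present, so an extra 'defer' token (assumed
    for patched kernels) is handled the same way.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> str:
    """Read and parse the THP mode of the running kernel (hypothetical helper)."""
    return parse_thp_mode(path.read_text())
```

Recording current_thp_mode() alongside each benchmark result is one
way to guard against a run being attributed to the wrong
configuration.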


Ultimately, we did observe that some of the GraphicsMagick tests
performed marginally better with Nico Pache's khugepaged collapse
implementation and thp=defer than with thp=madvise alone, which is
somewhat consistent with my theory. However, these improvements did
not appear to be statistically significant, and they closed only a
small part of the performance gap between thp=madvise and thp=always
in our workloads of interest.
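
This message does not say how significance was judged. As one
possible approach, and purely as an assumption, repeated scores from
two configurations could be compared with Welch's unequal-variance
t-test. The sketch below uses only the Python standard library and
returns the t statistic and the Welch-Satterthwaite degrees of
freedom. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test.

    `a` and `b` are sequences of repeated benchmark scores, each with at
    least two samples and not both of zero variance.
    """
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

The resulting |t| would then be compared against the critical value
of Student's t distribution for df degrees of freedom at the chosen
significance level.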

Results for other benchmarks in this set also did not show any
conclusive performance gains from mTHP=defer (though I was not
expecting those to change significantly with this series, since they
weren't heavily impacted by thp settings in my prior tests).

I can't speak to the impact of this series on other workloads; I
just wanted to share results for the ones we were aware of and
interested in.

Full results from our tests on the DGX A100 [1] and Lenovo SR670v2 [2]
are linked below.

[0]: https://www.phoronix.com/review/linux-os-ampereone/5
[1]: https://pastebin.ubuntu.com/p/SDSSj8cr6k/
[2]: https://pastebin.ubuntu.com/p/nqbWxyC33d/
[3]: https://lwn.net/ml/all/20250211003028.213461-1-npache@redhat.com
[4]: https://lwn.net/ml/all/20250211111326.14295-1-dev.jain@arm.com
[5]: https://lwn.net/ml/all/20250211004054.222931-1-npache@redhat.com
[6]: https://lwn.net/Articles/1009039/
-- 
Mitchell Augustin
Software Engineer - Ubuntu Partner Engineering
