Message-ID: <20251116014721.1561456-1-jiaqiyan@google.com>
Date: Sun, 16 Nov 2025 01:47:19 +0000
From: Jiaqi Yan <jiaqiyan@...gle.com>
To: nao.horiguchi@...il.com, linmiaohe@...wei.com, ziy@...dia.com
Cc: david@...hat.com, lorenzo.stoakes@...cle.com, william.roche@...cle.com, 
	harry.yoo@...cle.com, tony.luck@...el.com, wangkefeng.wang@...wei.com, 
	willy@...radead.org, jane.chu@...cle.com, akpm@...ux-foundation.org, 
	osalvador@...e.de, muchun.song@...ux.dev, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org, 
	Jiaqi Yan <jiaqiyan@...gle.com>
Subject: [PATCH v1 0/2] Only free healthy pages in high-order HWPoison folio

At the end of dissolve_free_hugetlb_folio, once a free HugeTLB
folio has become non-HugeTLB, it is released to the buddy
allocator as a high-order folio, e.g. a folio that contains
262144 pages if the folio was a 1G HugeTLB hugepage.
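
The page count above can be reproduced with a small userspace
sketch. It assumes 4 KiB base pages (which is what the 262144
figure implies); the sketch_* names are hypothetical helpers for
illustration only, not kernel APIs.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Number of base pages covered by a hugepage of the given size in bytes. */
static unsigned long sketch_pages_per_hugepage(unsigned long long hugepage_bytes)
{
	return (unsigned long)(hugepage_bytes >> SKETCH_PAGE_SHIFT);
}

/* Order of a folio with nr_pages pages; nr_pages must be a power of two. */
static unsigned int sketch_folio_order(unsigned long nr_pages)
{
	unsigned int order = 0;

	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	while ((1UL << order) < nr_pages)
		order++;
	return order;
}
```

Under that assumption, a 1G hugepage maps to 262144 base pages
(order 18) and a 2M hugepage to 512 base pages (order 9).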

This is problematic if the HugeTLB hugepage contained HWPoison
subpages. In that case, since the buddy allocator does not check
HWPoison for non-zero-order folios, the raw HWPoison page can
be given out together with its buddy pages and be re-used by
either the kernel or userspace.

Memory failure recovery (MFR) in the kernel does attempt to take
the raw HWPoison page off the buddy allocator after
dissolve_free_hugetlb_folio. However, there is always a time
window between dissolve_free_hugetlb_folio freeing a HWPoison
high-order folio to the buddy allocator and MFR taking the raw
HWPoison page off the buddy allocator.

One obvious way to avoid this problem is to add page sanity
checks in the page allocation or free path. However, that goes
against past efforts to reduce sanity check overhead [1,2,3].

Introduce hugetlb_free_hwpoison_folio to solve this problem.
The idea is, when a HugeTLB folio is known to contain HWPoison
pages, first split the non-HugeTLB high-order folio uniformly
into 0-order folios, then let the healthy pages join the buddy
allocator while rejecting the HWPoison ones.
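
As a rough userspace model of that idea (not the code in this
series), the split-then-filter step can be sketched as below.
struct sketch_page and sketch_free_healthy_pages are hypothetical
stand-ins; the real implementation operates on struct page/folio
and the actual buddy allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only two flags are modeled. */
struct sketch_page {
	bool hwpoison;	/* page has an uncorrectable memory error */
	bool in_buddy;	/* page was handed to the simulated allocator */
};

/*
 * Treat the folio as nr_pages independent 0-order pages, release
 * the healthy ones to the simulated buddy allocator and reject the
 * HWPoison ones. Returns the number of pages released.
 */
static size_t sketch_free_healthy_pages(struct sketch_page *pages,
					size_t nr_pages)
{
	size_t freed = 0;

	assert(pages != NULL || nr_pages == 0);
	for (size_t i = 0; i < nr_pages; i++) {
		if (pages[i].hwpoison) {
			/* Never let a poisoned page become allocatable. */
			pages[i].in_buddy = false;
			continue;
		}
		pages[i].in_buddy = true;
		freed++;
	}
	return freed;
}
```

In this model, a 512-page folio with 3 poisoned pages releases
509 pages and keeps the 3 poisoned ones out of the allocator.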

I tested with some test-only code [4] and hugetlb-mfr [5], by
checking the stats of pcplist and freelist immediately after
hugetlb_free_hwpoison_folio. After dealing with a HugeTLB folio
that contains 3 HWPoison raw pages, each page that used to be in
the folio ends up in one of four states:

* Some pages can still be in zone->per_cpu_pageset (pcplist)
  because pcp-count is not high enough.

* Many others are, after merging, in some order's
  zone->free_area[order].free_list (freelist).

* There may be some pages in neither pcplist nor freelist.
  My best guess is that they have already been allocated.

* The 3 HWPoison pages are verified to be in neither pcplist
  nor freelist.

For example:

* When hugepagesize=2M, 509 0-order pages are all placed in
  pcplist, and no page from the hugepage is in freelist.

* When hugepagesize=1G, in one of the tests, I observed that
  262069 pages are merged to buddy blocks of order 0 to 10,
  72 are in pcplist, and 3 HWPoison ones are isolated.
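
As a sanity check on the numbers in these examples, the per-state
counts can be summed against the total number of base pages in
the hugepage. The helper below is a hypothetical sketch, assuming
4 KiB base pages (512 pages per 2M, 262144 pages per 1G); it is
not part of the test code in [4].

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if the observed per-state page counts add up to the
 * total number of base pages in the original hugepage. "elsewhere"
 * counts pages found in neither pcplist nor freelist that are not
 * HWPoison (e.g. already allocated).
 */
static bool sketch_counts_add_up(unsigned long total,
				 unsigned long in_freelist,
				 unsigned long in_pcplist,
				 unsigned long hwpoison,
				 unsigned long elsewhere)
{
	assert(total > 0);
	return in_freelist + in_pcplist + hwpoison + elsewhere == total;
}
```

With the figures above, 0 + 509 + 3 equals 512 for the 2M case
and 262069 + 72 + 3 equals 262144 for the 1G case, so in both
runs every page is accounted for with "elsewhere" being 0.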

[1] https://lore.kernel.org/linux-mm/1460711275-1130-15-git-send-email-mgorman@techsingularity.net/
[2] https://lore.kernel.org/linux-mm/1460711275-1130-16-git-send-email-mgorman@techsingularity.net/
[3] https://lore.kernel.org/all/20230216095131.17336-1-vbabka@suse.cz
[4] https://drive.google.com/file/d/1CzJn1Cc4wCCm183Y77h244fyZIkTLzCt/view?usp=sharing
[5] https://lore.kernel.org/linux-mm/20251116013223.1557158-3-jiaqiyan@google.com

Jiaqi Yan (2):
  mm/huge_memory: introduce uniform_split_unmapped_folio_to_zero_order
  mm/memory-failure: avoid free HWPoison high-order folio

 include/linux/huge_mm.h |  6 ++++++
 include/linux/hugetlb.h |  4 ++++
 mm/huge_memory.c        |  8 ++++++++
 mm/hugetlb.c            |  8 ++++++--
 mm/memory-failure.c     | 43 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 67 insertions(+), 2 deletions(-)

-- 
2.52.0.rc1.455.g30608eb744-goog

