Message-ID: <20251009070045.2011920-1-xialonglong2025@163.com>
Date: Thu, 9 Oct 2025 15:00:44 +0800
From: Longlong Xia <xialonglong2025@....com>
To: linmiaohe@...wei.com,
nao.horiguchi@...il.com
Cc: akpm@...ux-foundation.org,
david@...hat.com,
wangkefeng.wang@...wei.com,
xu.xin16@....com.cn,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
Longlong Xia <xialonglong@...inos.cn>
Subject: [PATCH RFC 0/1] mm/ksm: Add recovery mechanism for memory failures
From: Longlong Xia <xialonglong@...inos.cn>

When a hardware memory error occurs on a KSM page, the current
behavior is to kill all processes mapping that page. This can
be overly aggressive when KSM has multiple duplicate pages in
a chain where other duplicates are still healthy.
This patch introduces a recovery mechanism that attempts to migrate
mappings from the failing page to another healthy duplicate within
the same chain before resorting to killing processes.

The recovery process works as follows (a rough pseudo-code sketch is
included after this list):

1. When a memory failure is detected on a KSM page, determine whether
   the failing stable node is part of a chain, i.e. has duplicates.
   (Perhaps add a dup_head field to struct stable_node that points back
   at the chain head, to avoid searching the whole stable tree, or find
   the head node some other way.)
2. Search for another healthy duplicate page within the same chain.
3. For each process mapping the failing page:
   - Update the PTE to point to the healthy duplicate page (maybe reuse
     replace_page(), or split replace_page() into smaller functions and
     use the common part).
   - Migrate the rmap_item to the new stable node.
4. If all migrations succeed, remove the failing node from the chain.
5. Only kill processes if recovery is impossible or fails.
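
For illustration, a rough pseudo-code sketch of this flow is below.
The ksm_* helpers and the recovery function itself are placeholders
for the open questions above, not existing mm/ksm.c code; the real
patch may end up structured quite differently:

static int ksm_recover_from_hwpoison(struct stable_node *dup)
{
	struct stable_node *chain_head, *new_dup;

	/* Step 1: only a node that is part of a chain has spare duplicates */
	chain_head = ksm_find_chain_head(dup);			/* placeholder */
	if (!chain_head)
		return -ENOENT;

	/* Step 2: find another duplicate in the chain that is not poisoned */
	new_dup = ksm_find_healthy_dup(chain_head, dup);	/* placeholder */
	if (!new_dup)
		return -ENOENT;

	/*
	 * Step 3: for every rmap_item hanging off the failing node, rewrite
	 * the PTE to point at the healthy duplicate (replace_page()-like)
	 * and move the rmap_item over to new_dup.
	 */
	if (ksm_migrate_rmap_items(dup, new_dup))		/* placeholder */
		return -EBUSY;

	/* Step 4: the failing node has no users left, drop it from the chain */
	ksm_remove_dup_from_chain(dup, chain_head);		/* placeholder */
	return 0;
}

Step 5 then stays in the generic memory_failure() path: processes are
only killed when this recovery returns an error.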
The original idea came from Naoya Horiguchi.
https://lore.kernel.org/all/20230331054243.GB1435482@hori.linux.bs1.fc.nec.co.jp/

I tested this with /sys/kernel/debug/hwpoison/corrupt-pfn in
qemu-x86_64. Here are my test steps and results:

1. Allocate 1024 pages with the same content and enable KSM to merge
   them (a minimal user-space sketch is included after the test steps).
   After merging (each phy_addr is only printed once):
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10e802000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000
d. virtual addr = 0x7e4c68d20000 phy_addr =0x10eb02000
2. echo 0x10e802 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10eb02000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000
d. virtual addr = 0x7e4c68d20000 phy_addr =0x10eb02000 (share with a)
3. echo 0x10eb02 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10ea02000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000 (share with a)
d. virtual addr = 0x7e4c68c58000 phy_addr =0x10ea02000 (share with a)
4. echo 0x10ea02 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10e902000
b. virtual addr = 0x7e4c68a32000 phy_addr =0x10e902000 (share with a)
c. virtual addr = 0x7e4c68a64000 phy_addr =0x10e902000 (share with a)
d. virtual addr = 0x7e4c68a96000 phy_addr =0x10e902000 (share with a)
5. echo 0x10e902 > /sys/kernel/debug/hwpoison/corrupt-pfn
MCE: Killing ksm_test:531 due to hardware memory corruption fault at 7e4c68a00000
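
For reference, step 1 boils down to a user-space helper along these
lines. This is only a minimal sketch, not the exact ksm_test binary:
the phy_addr lookup used for the output above (e.g. via
/proc/self/pagemap) is omitted, and it assumes ksmd is enabled with
echo 1 > /sys/kernel/mm/ksm/run:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define NPAGES 1024

int main(void)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	size_t len = NPAGES * pgsz;
	char *buf;
	int i;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fill every page with identical content so KSM can merge them */
	for (i = 0; i < NPAGES; i++)
		memset(buf + i * pgsz, 0x5a, pgsz);

	/* Mark the region mergeable; ksmd does the actual merging */
	if (madvise(buf, len, MADV_MERGEABLE)) {
		perror("madvise");
		return 1;
	}

	/* Keep the mappings alive while errors are injected via
	 * /sys/kernel/debug/hwpoison/corrupt-pfn */
	pause();
	return 0;
}
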
kernel-log:
Injecting memory failure at pfn 0x10e802
Memory failure: 0x10e802: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10eb02
Memory failure: 0x10eb02: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10ea02
Memory failure: 0x10ea02: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10e902
Memory failure: 0x10e902: recovery action for dirty LRU page: Recovered
MCE: Killing ksm_test:531 due to hardware memory corruption fault at 7e4c68a00000
Thanks for your review and comments!

Longlong Xia (1):
mm/ksm: Add recovery mechanism for memory failures
mm/ksm.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 183 insertions(+)
--
2.43.0