Message-ID: <20251009070045.2011920-1-xialonglong2025@163.com>
Date: Thu, 9 Oct 2025 15:00:44 +0800
From: Longlong Xia <xialonglong2025@....com>
To: linmiaohe@...wei.com,
nao.horiguchi@...il.com
Cc: akpm@...ux-foundation.org,
david@...hat.com,
wangkefeng.wang@...wei.com,
xu.xin16@....com.cn,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
Longlong Xia <xialonglong@...inos.cn>
Subject: [PATCH RFC 0/1] mm/ksm: Add recovery mechanism for memory failures
From: Longlong Xia <xialonglong@...inos.cn>

When a hardware memory error occurs on a KSM page, the current
behavior is to kill all processes mapping that page. This can
be overly aggressive when KSM has multiple duplicate pages in
a chain where other duplicates are still healthy.
This patch introduces a recovery mechanism that attempts to migrate
mappings from the failing page to another healthy duplicate within
the same chain before resorting to killing processes.

The recovery process works as follows (a rough pseudo-code sketch is
included after this list):

1. When a memory failure is detected on a KSM page, determine whether
   the failing stable node is part of a chain, i.e. has duplicates.
   (Perhaps add a dup_head field to struct stable_node that points back
   at the chain head, to avoid searching the whole stable tree, or find
   the head node some other way.)
2. Search for another healthy duplicate page within the same chain.
3. For each process mapping the failing page:
   - Update the PTE to point to the healthy duplicate page (maybe reuse
     replace_page(), or split replace_page() into smaller functions and
     use the common part).
   - Migrate the rmap_item to the new stable node.
4. If all migrations succeed, remove the failing node from the chain.
5. Only kill processes if recovery is impossible or fails.
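
For illustration, a rough pseudo-code sketch of this flow is below.
The ksm_* helpers and the recovery function itself are placeholders
for the open questions above, not existing mm/ksm.c code; the real
patch may end up structured quite differently:

static int ksm_recover_from_hwpoison(struct stable_node *dup)
{
	struct stable_node *chain_head, *new_dup;

	/* Step 1: only a node that is part of a chain has spare duplicates */
	chain_head = ksm_find_chain_head(dup);			/* placeholder */
	if (!chain_head)
		return -ENOENT;

	/* Step 2: find another duplicate in the chain that is not poisoned */
	new_dup = ksm_find_healthy_dup(chain_head, dup);	/* placeholder */
	if (!new_dup)
		return -ENOENT;

	/*
	 * Step 3: for every rmap_item hanging off the failing node, rewrite
	 * the PTE to point at the healthy duplicate (replace_page()-like)
	 * and move the rmap_item over to new_dup.
	 */
	if (ksm_migrate_rmap_items(dup, new_dup))		/* placeholder */
		return -EBUSY;

	/* Step 4: the failing node has no users left, drop it from the chain */
	ksm_remove_dup_from_chain(dup, chain_head);		/* placeholder */
	return 0;
}

Step 5 then stays in the generic memory_failure() path: processes are
only killed when this recovery returns an error.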
The original idea came from Naoya Horiguchi.
https://lore.kernel.org/all/20230331054243.GB1435482@hori.linux.bs1.fc.nec.co.jp/

I tested this with /sys/kernel/debug/hwpoison/corrupt-pfn in
qemu-x86_64. Here are my test steps and results:

1. Allocate 1024 pages with the same content and enable KSM to merge
   them (a minimal user-space sketch is included after the test steps).
   After merging (each phy_addr is only printed once):
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10e802000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000
d. virtual addr = 0x7e4c68d20000 phy_addr =0x10eb02000
2. echo 0x10e802 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10eb02000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000
d. virtual addr = 0x7e4c68d20000 phy_addr =0x10eb02000 (share with a)
3. echo 0x10eb02 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10ea02000
b. virtual addr = 0x7e4c68b2c000 phy_addr =0x10e902000
c. virtual addr = 0x7e4c68c26000 phy_addr =0x10ea02000 (share with a)
d. virtual addr = 0x7e4c68c58000 phy_addr =0x10ea02000 (share with a)
4. echo 0x10ea02 > /sys/kernel/debug/hwpoison/corrupt-pfn
a. virtual addr = 0x7e4c68a00000 phy_addr =0x10e902000
b. virtual addr = 0x7e4c68a32000 phy_addr =0x10e902000 (share with a)
c. virtual addr = 0x7e4c68a64000 phy_addr =0x10e902000 (share with a)
d. virtual addr = 0x7e4c68a96000 phy_addr =0x10e902000 (share with a)
5. echo 0x10e902 > /sys/kernel/debug/hwpoison/corrupt-pfn
MCE: Killing ksm_test:531 due to hardware memory corruption fault at 7e4c68a00000
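
For reference, step 1 boils down to a user-space helper along these
lines. This is only a minimal sketch, not the exact ksm_test binary:
the phy_addr lookup used for the output above (e.g. via
/proc/self/pagemap) is omitted, and it assumes ksmd is enabled with
echo 1 > /sys/kernel/mm/ksm/run:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define NPAGES 1024

int main(void)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	size_t len = NPAGES * pgsz;
	char *buf;
	int i;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fill every page with identical content so KSM can merge them */
	for (i = 0; i < NPAGES; i++)
		memset(buf + i * pgsz, 0x5a, pgsz);

	/* Mark the region mergeable; ksmd does the actual merging */
	if (madvise(buf, len, MADV_MERGEABLE)) {
		perror("madvise");
		return 1;
	}

	/* Keep the mappings alive while errors are injected via
	 * /sys/kernel/debug/hwpoison/corrupt-pfn */
	pause();
	return 0;
}
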
kernel-log:
Injecting memory failure at pfn 0x10e802
Memory failure: 0x10e802: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10eb02
Memory failure: 0x10eb02: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10ea02
Memory failure: 0x10ea02: recovery action for dirty LRU page: Recovered
Injecting memory failure at pfn 0x10e902
Memory failure: 0x10e902: recovery action for dirty LRU page: Recovered
MCE: Killing ksm_test:531 due to hardware memory corruption fault at 7e4c68a00000
Thanks for your review and comments!

Longlong Xia (1):
mm/ksm: Add recovery mechanism for memory failures
mm/ksm.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 183 insertions(+)
--
2.43.0