Message-ID: <20250211060200.33845-1-xueshuai@linux.alibaba.com>
Date: Tue, 11 Feb 2025 14:01:56 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: tony.luck@...el.com,
	bp@...en8.de,
	nao.horiguchi@...il.com
Cc: tglx@...utronix.de,
	mingo@...hat.com,
	dave.hansen@...ux.intel.com,
	x86@...nel.org,
	hpa@...or.com,
	linmiaohe@...wei.com,
	akpm@...ux-foundation.org,
	linux-edac@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	baolin.wang@...ux.alibaba.com,
	tianruidong@...ux.alibaba.com
Subject: [PATCH v1 0/4] mm/hwpoison: Fix regressions in memory failure handling
This patch series addresses three regressions in memory failure
handling, as discovered using ras-tools[1]:
- `./einj_mem_uc copyin -f`
- `./einj_mem_uc futex -f`
- `./einj_mem_uc instr`
The regressions in the copyin and futex cases were caused by the
replacement of `EX_TYPE_UACCESS` with `EX_TYPE_EFAULT_REG` in some
copy-from-user operations, leading to kernel panics. The instr case
regression resulted from the PTE not being marked as hwpoison,
which caused the system to send unnecessary SIGBUS signals.
These fixes ensure proper handling of memory errors and prevent kernel
panics and unnecessary signal dispatch.
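
As a rough illustration of the copyin/futex regression described above,
the sketch below models how an error context might be classified from
the exception-table fixup type. It is a simplified userspace model, not
the code in severity.c: the enum values only borrow the kernel's names,
and classify_context() and its handle_efault_reg switch are hypothetical
helpers introduced for this example.

```c
#include <stdbool.h>

/* Simplified stand-ins for exception-table fixup types (model only). */
enum ex_type {
	EX_TYPE_NONE,
	EX_TYPE_UACCESS,
	EX_TYPE_EFAULT_REG,
};

/* Simplified stand-ins for the machine-check error contexts. */
enum mce_context {
	IN_KERNEL,       /* treated as unrecoverable in this model */
	IN_KERNEL_RECOV, /* a fixup exists, so recovery can be attempted */
};

/*
 * Hypothetical classifier. With handle_efault_reg == false it models
 * the regressed behavior (EX_TYPE_EFAULT_REG is not recognized as a
 * recovery context); with true it models the intent of the fix.
 */
static enum mce_context classify_context(enum ex_type type,
					 bool handle_efault_reg)
{
	switch (type) {
	case EX_TYPE_UACCESS:
		return IN_KERNEL_RECOV;
	case EX_TYPE_EFAULT_REG:
		return handle_efault_reg ? IN_KERNEL_RECOV : IN_KERNEL;
	default:
		return IN_KERNEL;
	}
}
```

In this model, a copy-from-user site whose fixup type changed from
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG falls through to IN_KERNEL unless
the classifier is taught about the new type, which matches the panic
seen in the copyin and futex cases.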
[1]https://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git
Shuai Xue (4):
  x86/mce: Collect error message for severities below MCE_PANIC_SEVERITY
  x86/mce: dump error msg from severities
  x86/mce: add EX_TYPE_EFAULT_REG as in-kernel recovery context to fix
    copy-from-user operations regression
  mm/hwpoison: Fix incorrect "not recovered" report for recovered clean
    pages
 arch/x86/kernel/cpu/mce/core.c     | 19 +++++++++++++------
 arch/x86/kernel/cpu/mce/severity.c | 21 ++++++++++++++++-----
 mm/memory-failure.c                |  5 ++---
 3 files changed, 31 insertions(+), 14 deletions(-)
-- 
2.39.3