Message-Id: <20231026003903.382885-1-zhiquan1.li@intel.com>
Date:   Thu, 26 Oct 2023 08:39:03 +0800
From:   Zhiquan Li <zhiquan1.li@...el.com>
To:     x86@...nel.org, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org, patches@...ts.linux.dev,
        bp@...en8.de, mingo@...nel.org, tony.luck@...el.com,
        naoya.horiguchi@....com
Cc:     dan.carpenter@...aro.org, zhiquan1.li@...el.com,
        Youquan Song <youquan.song@...el.com>
Subject: [PATCH v5] x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel

Memory errors don't happen very often, especially fatal ones. However,
in large-scale scenarios such as data centers, that probability
increases with the number of machines present.

When a fatal machine check happens, mce_panic() is called based on the
severity grading of that error. The page containing the error is not
marked as poison.

However, when kexec is enabled, tools like makedumpfile understand when
pages are marked as poison and do not touch them so as not to cause
a fatal machine check exception again while dumping the previous
kernel's memory.

Therefore, mark the page containing the error as poisoned so that the
kexec'ed kernel can avoid accessing the page.

Co-developed-by: Youquan Song <youquan.song@...el.com>
Signed-off-by: Youquan Song <youquan.song@...el.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@...el.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@....com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Dan Carpenter <dan.carpenter@...aro.org>

---

V4: https://lore.kernel.org/all/20231023042237.173290-1-zhiquan1.li@intel.com/

V4 -> V5:
- Fixed the bug reported by Dan Carpenter, which was introduced in V3.
  Link: https://lore.kernel.org/all/12eb6db6-bc24-4e7c-af34-a5c68d49d52a@moroto.mountain/
- Many thanks to Boris for re-writing the commit message and comment.

V3: https://lore.kernel.org/all/20231014051754.3759099-1-zhiquan1.li@intel.com/

V3 -> V4:
- Rebased to v6.6-rc7.
- Added the check for whether kexec is enabled, as highlighted by Boris.
- Re-wrote the commit message as suggested by Tony.

V2: https://lore.kernel.org/all/20230914030539.1622477-1-zhiquan1.li@intel.com/

V2 -> V3:
- Rebased to v6.6-rc5.
- Moved the logic of function mce_set_page_hwpoison_now() into
  mce_panic().
- Explained full scenario in commit message per Boris's suggestion.
- Included Ingo's fixes.
  Link: https://lore.kernel.org/all/ZRsUpM%2FXtPAE50Rm@gmail.com/

V1: https://lore.kernel.org/all/20230127015030.30074-1-tony.luck@intel.com/

V1 -> V2:
- Revised the commit message as per Naoya's suggestion.
- Replaced "TODO" comment in code with comments based on mailing list
  discussion on the lack of value in covering other page types.
- Added the tag from Naoya.
  Link: https://lore.kernel.org/all/20230327083739.GA956278@hori.linux.bs1.fc.nec.co.jp/
---
 arch/x86/kernel/cpu/mce/core.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 6f35f724cc14..20ab11aec60b 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -44,6 +44,7 @@
 #include <linux/sync_core.h>
 #include <linux/task_work.h>
 #include <linux/hardirq.h>
+#include <linux/kexec.h>
 
 #include <asm/intel-family.h>
 #include <asm/processor.h>
@@ -233,6 +234,7 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	struct llist_node *pending;
 	struct mce_evt_llist *l;
 	int apei_err = 0;
+	struct page *p;
 
 	/*
 	 * Allow instrumentation around external facilities usage. Not that it
@@ -286,6 +288,20 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
 	if (!fake_panic) {
 		if (panic_timeout == 0)
 			panic_timeout = mca_cfg.panic_timeout;
+
+		/*
+		 * Kdump skips the poisoned page in order to avoid
+		 * touching the error bits again. Poison the page even
+		 * if the error is fatal and the machine is about to
+		 * panic.
+		 */
+		if (kexec_crash_loaded()) {
+			if (final && (final->status & MCI_STATUS_ADDRV)) {
+				p = pfn_to_online_page(final->addr >> PAGE_SHIFT);
+				if (p)
+					SetPageHWPoison(p);
+			}
+		}
 		panic(msg);
 	} else
 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
-- 
2.25.1
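
For readers who want to reason about the new branch outside the kernel,
the check added to mce_panic() can be modeled in userspace roughly as
follows. This is only a sketch under stated assumptions: every sketch_*
name is hypothetical, the bool array stands in for the PG_hwpoison page
flag, the bounds check stands in for pfn_to_online_page() returning NULL,
and the crash_kernel_loaded argument stands in for kexec_crash_loaded().
The ADDRV bit position (58) and the 4 KiB page shift match x86, but
nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12            /* 4 KiB pages, as on x86 */
#define SKETCH_STATUS_ADDRV (1ULL << 58)  /* MCi_STATUS.ADDRV: address is valid */
#define SKETCH_NR_PAGES     1024          /* hypothetical amount of online memory */

/* Hypothetical, cut-down stand-in for struct mce. */
struct sketch_mce {
	uint64_t status;
	uint64_t addr;
};

/* One flag per page frame; models the PG_hwpoison page flag. */
static bool sketch_poisoned[SKETCH_NR_PAGES];

/* Models pfn_to_online_page(): NULL when the pfn is not backed by online memory. */
static bool *sketch_pfn_to_online_page(uint64_t pfn)
{
	return pfn < SKETCH_NR_PAGES ? &sketch_poisoned[pfn] : NULL;
}

/*
 * Mirrors the order of checks in the patch: crash kernel loaded, a final
 * record with a valid address, and an online page behind that address.
 * Returns true only if a page was actually marked.
 */
static bool sketch_poison_before_panic(const struct sketch_mce *final,
				       bool crash_kernel_loaded)
{
	bool *page;

	if (!crash_kernel_loaded)
		return false;
	if (!final || !(final->status & SKETCH_STATUS_ADDRV))
		return false;

	page = sketch_pfn_to_online_page(final->addr >> SKETCH_PAGE_SHIFT);
	if (!page)
		return false;

	/* The lookup above only succeeds for in-range page frames. */
	assert((final->addr >> SKETCH_PAGE_SHIFT) < SKETCH_NR_PAGES);
	*page = true;
	return true;
}
```

Calling sketch_poison_before_panic() with a record whose status has
SKETCH_STATUS_ADDRV set and whose addr is 0x5123 marks page frame 5 when
the crash-kernel argument is true, and leaves every flag untouched when it
is false, the record is NULL, ADDRV is clear, or the address falls outside
the modeled memory.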
