lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210823204122.GA1640015@agluck-desk2.amr.corp.intel.com>
Date:   Mon, 23 Aug 2021 13:41:22 -0700
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     x86@...nel.org, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Sumanth Kamatala <skamatala@...iper.net>
Subject: [PATCH v2] x86/mce: Defer processing early errors until
 mcheck_late_init()

When a fatal machine check results in a system reset, Linux does
not clear the error(s) from machine check bank(s).

Hardware preserves the machine check banks across a warm reset.

During initialization of the kernel after the reboot, Linux reads,
logs, and clears all machine check banks.

But there is a problem. In:
commit 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
the call to mce_register_decode_chain() moved later in the boot sequence.
This means that /dev/mcelog doesn't see those early error logs.

This was partially fixed by:
commit cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers")

which made sure that the logs were not lost completely by printing
to the console. But parsing console logs is error prone. Users
of /dev/mcelog should expect to find any early errors logged to
standard places.

Delay processing logs until after all built-in code has had a chance
to register on the mce notifier chain (modules are still out of luck,
there's not way to know how long to wait for those to load).

Fixes: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
Reported-by: Sumanth Kamatala <skamatala@...iper.net>
Signed-off-by: Tony Luck <tony.luck@...el.com>
---
 arch/x86/kernel/cpu/mce/core.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 22791aadc085..593af202f586 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -129,6 +129,8 @@ static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
  */
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
+static bool mce_init_complete;
+
 /* Do initial initialization of a struct mce */
 noinstr void mce_setup(struct mce *m)
 {
@@ -155,7 +157,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
 
 void mce_log(struct mce *m)
 {
-	if (!mce_gen_pool_add(m))
+	if (!mce_gen_pool_add(m) && mce_init_complete)
 		irq_work_queue(&mce_irq_work);
 }
 EXPORT_SYMBOL_GPL(mce_log);
@@ -2771,6 +2773,8 @@ static int __init mcheck_late_init(void)
 
 	mcheck_debugfs_init();
 
+	mce_init_complete = true;
+
 	/*
 	 * Flush out everything that has been logged during early boot, now that
 	 * everything has been initialized (workqueues, decoders, ...).
-- 
2.29.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ