Date:   Thu, 22 Jun 2023 08:18:41 -0500
From:   Yazen Ghannam <yazen.ghannam@....com>
To:     <linux-edac@...r.kernel.org>
CC:     <linux-kernel@...r.kernel.org>, <tony.luck@...el.com>,
        <x86@...nel.org>, Yazen Ghannam <yazen.ghannam@....com>
Subject: [PATCH 2/2] x86/mce: Set correct PPIN for CPER decoding

Scalable MCA systems may report errors found during boot-time polling
through the ACPI Boot Error Record Table (BERT). The errors are logged
in an "x86 Processor" Common Platform Error Record (CPER). The format of
the x86 CPER does not include a logical CPU number, but it does provide
the logical APIC ID for the logical CPU. Also, it does not explicitly
provide MCA error information, but it can share this information using
an "MSR Context" defined in the CPER format.

The MCA error information is parsed by:
1) Checking that the context matches the Scalable MCA register space.
2) Finding the logical CPU that matches the logical APIC ID from the
   CPER.
3) Filling in struct mce with the relevant data and logging it.

All the above is done when the BERT is processed during late init. This
can be scheduled on any CPU, and it may be preemptible.

This results in two issues:
1) mce_setup() includes a call to smp_processor_id(). This will throw a
   warning if preemption is enabled.
2) mce_setup() will pull info from the executing CPU, so some info in
   struct mce may be incorrect for the CPU with the error. For example,
   in a dual-socket system, an error logged on a socket 1 CPU but
   processed by a socket 0 CPU will save the PPIN of the socket 0 CPU.

Fix the second issue by using the PPIN value cached during CPU init.
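
For illustration only, the lookup can be modelled in userspace roughly as
below. This is a sketch, not the kernel code: struct cpu_info, struct
mce_record, sample_cpus and fill_from_apicid() are hypothetical stand-ins
for cpu_data(), struct mce and the loop in apei_smca_report_x86_error(),
and the PPIN values are made up. It shows that the PPIN is taken from the
entry whose APIC ID matches the CPER, not from whichever CPU happens to
run the code, and that it stays 0 when no CPU matches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU data cached during CPU init. */
struct cpu_info {
	uint32_t initial_apicid;
	uint16_t phys_proc_id;
	uint64_t ppin;
};

/* Hypothetical stand-in for the struct mce fields touched by the patch. */
struct mce_record {
	int extcpu;
	int socketid;
	uint64_t ppin;
};

/* Made-up dual-socket layout: CPUs 0-1 on socket 0, CPUs 2-3 on socket 1. */
static const struct cpu_info sample_cpus[] = {
	{ 0x00, 0, 0x1111111111111111ULL },
	{ 0x01, 0, 0x1111111111111111ULL },
	{ 0x20, 1, 0x2222222222222222ULL },
	{ 0x21, 1, 0x2222222222222222ULL },
};

/*
 * Fill in the CPU number, socket ID and PPIN of the CPU whose APIC ID
 * matches lapic_id. Returns 0 on a match, -1 otherwise; on a miss the
 * record keeps the defaults (-1, -1, 0).
 */
static int fill_from_apicid(struct mce_record *m,
			    const struct cpu_info *cpus, size_t ncpus,
			    uint64_t lapic_id)
{
	size_t cpu;

	m->extcpu = -1;
	m->socketid = -1;
	m->ppin = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (cpus[cpu].initial_apicid == lapic_id) {
			m->extcpu = (int)cpu;
			m->socketid = cpus[cpu].phys_proc_id;
			/* Use the cached value for the CPU with the error. */
			m->ppin = cpus[cpu].ppin;
			return 0;
		}
	}

	return -1;
}
```

In this model, calling fill_from_apicid() with APIC ID 0x20 yields the
socket 1 PPIN no matter which CPU runs the call, which is the behaviour
the fix is after.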

Fixes: 4a24d80b8c3e ("x86/mce, cper: Pass x86 CPER through the MCA handling chain")
Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
Cc: stable@...r.kernel.org
---
 arch/x86/kernel/cpu/mce/apei.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 2a7a51ca2995..db16dc3c7b03 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -103,11 +103,13 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 
 	m.extcpu = -1;
 	m.socketid = -1;
+	m.ppin = 0;
 
 	for_each_possible_cpu(cpu) {
 		if (cpu_data(cpu).initial_apicid == lapic_id) {
 			m.extcpu = cpu;
 			m.socketid = cpu_data(m.extcpu).phys_proc_id;
+			m.ppin = cpu_data(m.extcpu).ppin;
 			break;
 		}
 	}
-- 
2.34.1
