lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180823122410.GA17393@nazgul.tnic>
Date:   Thu, 23 Aug 2018 14:24:10 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Ghannam, Yazen" <Yazen.Ghannam@....com>
Cc:     "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "tony.luck@...el.com" <tony.luck@...el.com>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH 1/2] Revert "x86/mce/AMD: Collect error info even if
 valid bits are not set"

Reviving an old issue while cleaning my inbox.

On Tue, Mar 27, 2018 at 03:59:37PM +0000, Ghannam, Yazen wrote:
> > > On Mon, Mar 26, 2018 at 07:58:51PM +0000, Ghannam, Yazen wrote:
> > > > So at a minimum, we should always save and report as much as we can.
> > >
> > > Only on Zen or all AMD families?
> > >
> > 
> > I'll confirm with the HW folks. I understand it as a change in philosophy
> > rather than a change in hardware.
> > 
> 
> So this recommendation could apply to all families, but it's okay if we just

Ok, so I think we should do this, still, as it is exactly what the
recommendation says: read the MSRs even if the valid bits are not set and it
doesn't set any Valid bits to confuse error handling downstream.

This way we'll collect all possible info and then mce_amd.c should stop looking
at the valid bits too and dump whatever has been logged.

Ok?

---
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 4b767284b7f5..c14674025615 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -639,10 +639,12 @@ static struct notifier_block mce_default_nb = {
  */
 static void mce_read_aux(struct mce *m, int i)
 {
-	if (m->status & MCI_STATUS_MISCV)
+	bool is_amd = m->cpuvendor == X86_VENDOR_AMD;
+
+	if (m->status & MCI_STATUS_MISCV || is_amd)
 		m->misc = mce_rdmsrl(msr_ops.misc(i));
 
-	if (m->status & MCI_STATUS_ADDRV) {
+	if (m->status & MCI_STATUS_ADDRV || is_amd) {
 		m->addr = mce_rdmsrl(msr_ops.addr(i));
 
 		/*
@@ -668,7 +670,7 @@ static void mce_read_aux(struct mce *m, int i)
 	if (mce_flags.smca) {
 		m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i));
 
-		if (m->status & MCI_STATUS_SYNDV)
+		if (m->status & MCI_STATUS_SYNDV || is_amd)
 			m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i));
 	}
 }

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ