Date:   Sat, 9 Sep 2017 19:05:37 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Markus Trippelsdorf <markus@...ppelsdorf.de>
Cc:     Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Tom Lendacky <thomas.lendacky@....com>
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

On Sat, Sep 09, 2017 at 06:32:25PM +0200, Markus Trippelsdorf wrote:
> Also tried the following patch. It does not help.

Ok, another theory. This one still needs to be fixed properly, but
that's for later.

For some reason (insufficient coffee maybe), I mistyped your MCi_STATUS
value earlier. Your mail says it is "fa000010000b0c0f". Do you still
have a screen photo to verify it?

Because if so, the correct error type is:

MC4_STATUS[Val|Over|UC|EN|MiscV|PCC|EEC: Protocol error (link, L3, probe filter) (0x0b)|ET: BUS(pp:OBS;t:NOTIMOUT;r4:GEN;ii:GEN;ll:LG)]: 0xfa000010000b0c0f

And for that I'd need the MC4_ADDR value too.
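
For reference, the flag part of that decode can be reproduced by hand.
Below is a minimal sketch, not the kernel's decoder: the MCI_STATUS_*
bit positions match arch/x86/include/asm/mce.h, but decode_flags(),
ext_error_code() and error_code() are hypothetical helper names, and the
5-bit extended error code mask (bits 20:16) is an assumption that holds
for newer AMD families only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit positions as in arch/x86/include/asm/mce.h. */
#define MCI_STATUS_VAL   (1ULL << 63)	/* valid error */
#define MCI_STATUS_OVER  (1ULL << 62)	/* previous errors lost */
#define MCI_STATUS_UC    (1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)	/* error enabled */
#define MCI_STATUS_MISCV (1ULL << 59)	/* MISC register valid */
#define MCI_STATUS_ADDRV (1ULL << 58)	/* ADDR register valid */
#define MCI_STATUS_PCC   (1ULL << 57)	/* processor context corrupt */

/* Hypothetical helper: write the set flags into buf as "A|B|C". */
static void decode_flags(uint64_t status, char *buf, size_t len)
{
	static const struct {
		uint64_t bit;
		const char *name;
	} flags[] = {
		{ MCI_STATUS_VAL,   "Val"   },
		{ MCI_STATUS_OVER,  "Over"  },
		{ MCI_STATUS_UC,    "UC"    },
		{ MCI_STATUS_EN,    "EN"    },
		{ MCI_STATUS_MISCV, "MiscV" },
		{ MCI_STATUS_ADDRV, "AddrV" },
		{ MCI_STATUS_PCC,   "PCC"   },
	};
	size_t i;

	if (!len)
		return;

	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (!(status & flags[i].bit))
			continue;
		if (buf[0])
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, flags[i].name, len - strlen(buf) - 1);
	}
}

/* Assumption: extended error code in bits 20:16 (older parts use 4 bits). */
static unsigned int ext_error_code(uint64_t status)
{
	return (status >> 16) & 0x1f;
}

/* Architectural MCA error code, bits 15:0. */
static unsigned int error_code(uint64_t status)
{
	return status & 0xffff;
}
```

Feeding 0xfa000010000b0c0f into decode_flags() gives the flag list in
the line above, with AddrV missing, and ext_error_code() returns 0x0b.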

So can you please apply the patch below on top of the syncflood quirk
patch and retrigger, take a photo of the MCE and send it to me?

Thanks.

---
commit e84e5ad290c7c26af69a721148f404766529509b
Author: Borislav Petkov <bp@...e.de>
Date:   Sat Sep 9 00:55:50 2017 +0200

    x86/MCE/AMD: Collect error info even if valid bits are not set
    
    The MCA banks log error info into MCA_ADDR, MCA_MISC0, and MCA_SYND even
    if the corresponding valid bits are not set:
    
    "Error handlers should save the values in MCA_ADDR, MCA_MISC0,
    and MCA_SYND even if MCA_STATUS[AddrV], MCA_STATUS[MiscV], and
    MCA_STATUS[SyndV] are zero."
    
    Do so by setting those bits so that code down the MCE processing path
    doesn't need to be changed.
    
    Signed-off-by: Borislav Petkov <bp@...e.de>

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3b413065c613..c63c7ef326c7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -436,6 +436,20 @@ static inline void mce_gather_info(struct mce *m, struct pt_regs *regs)
 		if (mca_cfg.rip_msr)
 			m->ip = mce_rdmsrl(mca_cfg.rip_msr);
 	}
+
+	/*
+	 * Error handlers should save the values in MCA_ADDR, MCA_MISC0, and
+	 * MCA_SYND even if MCA_STATUS[AddrV], MCA_STATUS[MiscV], and
+	 * MCA_STATUS[SyndV] are zero.
+	 */
+	if (m->cpuvendor == X86_VENDOR_AMD) {
+		u64 status = MCI_STATUS_ADDRV | MCI_STATUS_MISCV;
+
+		if (mce_flags.smca)
+			status |= MCI_STATUS_SYNDV;
+
+		m->status |= status;
+	}
 }
 
 int mce_available(struct cpuinfo_x86 *c)
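
To illustrate what the hunk above does to the status value from this
thread, here is a small user-space sketch. It only mirrors the OR logic
of the patch. force_valid_bits() and its is_amd/has_smca parameters are
hypothetical stand-ins for the m->cpuvendor and mce_flags.smca checks,
not kernel interfaces.

```c
#include <stdint.h>

/* Bit positions as in arch/x86/include/asm/mce.h. */
#define MCI_STATUS_MISCV (1ULL << 59)	/* MISC register valid */
#define MCI_STATUS_ADDRV (1ULL << 58)	/* ADDR register valid */
#define MCI_STATUS_SYNDV (1ULL << 53)	/* SYND register valid (SMCA) */

/*
 * Hypothetical stand-in for the new block in mce_gather_info():
 * on AMD, mark ADDR and MISC (and SYND with SMCA) as valid so that
 * later code reads and reports those registers.
 */
static uint64_t force_valid_bits(uint64_t status, int is_amd, int has_smca)
{
	uint64_t extra;

	if (!is_amd)
		return status;

	extra = MCI_STATUS_ADDRV | MCI_STATUS_MISCV;
	if (has_smca)
		extra |= MCI_STATUS_SYNDV;

	return status | extra;
}
```

With the value above, force_valid_bits(0xfa000010000b0c0f, 1, 0) yields
0xfe000010000b0c0f: MiscV was already set, so only AddrV is added.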

-- 
Regards/Gruss,
    Boris.
