Date:   Tue, 29 May 2018 11:22:21 -0700
From:   "Raj, Ashok" <ashok.raj@...el.com>
To:     Borislav Petkov <bp@...e.de>
Cc:     Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Qiuxu Zhuo <qiuxu.zhuo@...el.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, Ashok Raj <ashok.raj@...el.com>
Subject: Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown
 source" message

On Mon, May 28, 2018 at 10:49:23PM +0200, Borislav Petkov wrote:
> On Fri, May 25, 2018 at 02:41:55PM -0700, Tony Luck wrote:
> > @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
> >  			no_way_out = worst >= MCE_PANIC_SEVERITY;
> >  	} else {
> >  		/*
> > -		 * Local MCE skipped calling mce_reign()
> > -		 * If we found a fatal error, we need to panic here.
> > +		 * If there was a fatal machine check we should have
> > +		 * already called mce_panic earlier in this function.
> > +		 * Since we re-read the banks, we might have found
> > +		 * something new. Check again to see if we found a
> > +		 * fatal error. We call "mce_severity()" again to
> > +		 * make sure we have the right "msg".
> >  		 */
> > -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> > -			mce_panic("Machine check from unknown source",
> > -				NULL, NULL);
> > +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> > +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> > +			mce_panic("Local fatal machine check!", &m, msg);

If this doesn't affect mcelog parsing, would it make sense to change this from
"fatal" to "unrecoverable"? On x86, "fatal" typically screams PCC=1, but
some of these cases are software recoverable; it is just that the kernel
isn't able to perform the recovery.


> 
> Haha, this would still make you look at the code to remember was it
> "fatal local" or "local fatal" the second one. Yeah, there's the "!" but
> still.
> 
> How about:
> 
> 	"Fatal local machine check after banks scan"
> 
> or so.
> 
> Btw, the code in do_machine_check() has become one helluva spaghetti
> mess. It could use some clean up a bit... :)
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> -- 
