Message-ID: <20150717045221.GA3798@otc-brkl-03.jf.intel.com>
Date:	Fri, 17 Jul 2015 00:52:21 -0400
From:	"Raj, Ashok" <ashok.raj@...el.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Borislav Petkov <bp@...e.de>, Ingo Molnar <mingo@...nel.org>,
	Tony Luck <tony.luck@...el.com>, X86-ML <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chen Gong <gong.chen@...ux.intel.com>,
	Aravind Gopalakrishnan <Aravind.Gopalakrishnan@....com>,
	Oleg Nesterov <oleg@...hat.com>,
	linux-edac <linux-edac@...r.kernel.org>,
	Ashok Raj <ashok.raj@...el.com>
Subject: Re: [PATCH 7/7] x86/mce: Clear Local MCE opt-in before kexec

On Thu, Jul 16, 2015 at 06:16:50PM -0700, Andy Lutomirski wrote:
> > From: Ashok Raj <ashok.raj@...el.com>
> >
> > kexec could boot a kernel that could be legacy with no knowledge of
> > LMCE. Hence we should make sure we clear LMCE optin before kexec reboot.
> >
> 
> What happens if an offline-but-not-unplugged CPU gets an MCE?  Or does
> this code also clear CR4.MCE?

kexec doesn't use the cpu_offline() path; instead it sends an IPI to all
threads before letting the BSP jump to the new kernel.

In this patch, we only turned off the LMCE opt-in. CR4.MCE isn't touched.

If an offline-but-not-unplugged CPU gets an MCE, it's usually fatal and will
be broadcast to all CPUs in the system.

Turning off CR4.MCE would not be good, since an MCE delivered to any thread
that has CR4.MCE=0 would reset the whole system.
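
The distinction between the two controls can be sketched in a small
user-space model. This is an illustration, not kernel code: struct cpu_model
and the model_* functions are hypothetical names invented here. The bit
positions (CR4 bit 6 for MCE, IA32_MCG_EXT_CTL bit 0 for the LMCE opt-in)
follow the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as documented in the Intel SDM. */
#define CR4_MCE             (1ULL << 6)  /* CR4.MCE: machine-check enable */
#define MCG_EXT_CTL_LMCE_EN (1ULL << 0)  /* IA32_MCG_EXT_CTL: LMCE opt-in */

/* Hypothetical per-thread register model; not a kernel type. */
struct cpu_model {
	uint64_t cr4;
	uint64_t mcg_ext_ctl;
};

/*
 * Models what the patch is described as doing before kexec:
 * drop only the LMCE opt-in and leave CR4.MCE untouched.
 */
void model_clear_lmce_optin(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	cpu->mcg_ext_ctl &= ~MCG_EXT_CTL_LMCE_EN;
}

/*
 * Models the hazard described above: an MCE arriving on a thread
 * with CR4.MCE=0 takes the whole system down.
 */
bool model_mce_resets_system(const struct cpu_model *cpu)
{
	assert(cpu != NULL);
	return (cpu->cr4 & CR4_MCE) == 0;
}
```

Starting from a thread with both bits set, model_clear_lmce_optin() leaves
CR4.MCE in place, so model_mce_resets_system() still returns false.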

There are other MCE bugs in the offline path that I'm working on, and I will
send a patch update for them.

For example, one such bug is that during CPU_DOWN_PREPARE, mce_disable_cpu()
turns off MCx_CTL().

Machine check banks in the uncore are visible to all logical CPUs, so we
should not clear them. Today, offlining a single CPU disables MCE generation
for any of the uncore banks. I have the fixes brewing in a test and should
release them in a couple of weeks or so.

We can only clear banks during cpu_offline() if they are thread-local.
We don't have such banks today (but they are coming). Most banks are either
core-scoped or socket-scoped.
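
The rule above can be sketched as a scope check in a user-space model. This
is only an illustration under stated assumptions: enum bank_scope,
struct mc_bank and model_offline_clear_banks() are hypothetical names, not
existing kernel interfaces, and ctl_enabled stands in for a non-zero MCx_CTL
value. For simplicity, core-scoped banks are always treated as shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scope tag for a machine check bank. */
enum bank_scope { SCOPE_THREAD, SCOPE_CORE, SCOPE_SOCKET };

/* Hypothetical bank model; ctl_enabled stands in for MCx_CTL != 0. */
struct mc_bank {
	enum bank_scope scope;
	bool ctl_enabled;
};

/*
 * Disable only the banks private to the thread going offline.
 * Core- and socket-scoped banks are visible to other logical CPUs,
 * so they are left enabled. Returns the number of banks disabled.
 */
size_t model_offline_clear_banks(struct mc_bank *banks, size_t nbanks)
{
	size_t cleared = 0;

	assert(banks != NULL || nbanks == 0);
	for (size_t i = 0; i < nbanks; i++) {
		if (banks[i].scope != SCOPE_THREAD)
			continue;	/* shared bank: leave it alone */
		if (banks[i].ctl_enabled) {
			banks[i].ctl_enabled = false;
			cleared++;
		}
	}
	return cleared;
}
```

With one core-scoped, one socket-scoped and one thread-scoped bank, only the
thread-scoped bank is disabled, so the CPUs that stay online keep their MCE
reporting for the shared banks.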

Cheers,
Ashok
