Message-ID: <20160918183905.GB331@nazgul.tnic>
Date:   Sun, 18 Sep 2016 20:39:05 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Yinghai Lu <yinghai@...nel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
        Yinghai Lu <yinghai.lu@...cle.com>
Subject: Re: [RFC PATCH] x86: Do not panic if mce=2 is passed

On Fri, Sep 16, 2016 at 08:28:44PM +0000, Luck, Tony wrote:
> > For UE recovery support, currently we need mce=2 in command line
> > and also disable panic_on_oops with sysctl.
> 
> Please explain. I've never given mce=2 on command line, and have
> had my kernel recover from thousands of (injected) UE memory errors.

So frankly, that panic_on_oops doesn't make a whole lotta sense to me.

It is promoting MCEs with severity MCE_UC_SEVERITY and higher to a
panic.

So let's look at those:

	MCE_UC_SEVERITY,	- we don't do anything special in the kernel for
				those so just as well.
	MCE_AR_SEVERITY,	- those end up in the memory failure code if
				they're memory errors
	MCE_PANIC_SEVERITY,	- causes panic

so if anything, panic_on_oops shouldn't control the panicking behavior
as tolerant does that already:

	 * Tolerant levels:
	 * 0: always panic on uncorrected errors, log corrected errors
	 * 1: panic or SIGBUS on uncorrected errors, log corrected errors
	 * 2: SIGBUS or log uncorrected errors (if possible), log corr. errors
	 * 3: never panic or SIGBUS, log all errors (for testing only)
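
To make that table concrete, here is a toy C model of it. This is only a
sketch of a literal reading of the comment above, not the kernel's actual
machine check handling. The names toy_action, enum toy_act and the
can_signal flag are made up for illustration, and treating "if possible"
as "a task can be sent SIGBUS" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these exist in the kernel. */
enum toy_act { TOY_LOG, TOY_SIGBUS, TOY_PANIC };

/*
 * tolerant:    level 0..3 as in the comment table
 * uncorrected: true for an uncorrected error, false for a corrected one
 * can_signal:  assumption: stands in for "delivering SIGBUS is possible"
 */
static enum toy_act toy_action(int tolerant, bool uncorrected, bool can_signal)
{
	assert(tolerant >= 0 && tolerant <= 3);

	if (!uncorrected)
		return TOY_LOG;		/* corrected errors are only logged */

	switch (tolerant) {
	case 0:
		return TOY_PANIC;	/* always panic */
	case 1:
		return can_signal ? TOY_SIGBUS : TOY_PANIC;
	case 2:
		return can_signal ? TOY_SIGBUS : TOY_LOG;
	default:
		return TOY_LOG;		/* level 3: log only */
	}
}
```

In this reading the tolerant level alone decides whether an uncorrected
error ends in a panic, e.g. toy_action(2, true, false) gives TOY_LOG while
toy_action(1, true, false) gives TOY_PANIC.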

IOW, I think that patch makes sense but please double-check my logic
above first.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.