lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 22 May 2014 08:30:33 +0900
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"Luck, Tony" <tony.luck@...el.com>
Cc:	Andy Lutomirski <luto@...capital.net>,
	Borislav Petkov <bp@...en8.de>, Jiri Kosina <jkosina@...e.cz>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andi Kleen <andi@...stfloor.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC] x86_64: A real proposal for iret-less return to kernel

On Thu, May 22, 2014 at 8:19 AM, Luck, Tony <tony.luck@...el.com> wrote:
>
> Yes. Bystander broadcast machine checks can and will hit processors
> that are in NMI context ... and we must not make that fatal.

.. and this, btw, is just another example of why MCE hardware
designers are f*cking morons that should be given extensive education
about birth control and how not to procreate.

MCE is frankly misdesigned. It's a piece of shit, and any of the
hardware designers that claim that what they do is for system
stability are out to lunch. This is a prime example of what *NOT* to
do, and how you can actually spread what was potentially a localized
and recoverable error, and make it global and unrecoverable.

Can we please get these designers either fired, or re-educated?
Because this shit has been going on too long. I complained about this
to Tony many years ago, and nothing was ever fixed.

Synchronous MCE's are fine for synchronous errors, but then trying to
turn them "synchronous" for other CPU's (where they *weren't*
synchronous errors) is a major mistake. External errors punching
through irq context is wrong, punching through NMI is just
inexcusable.

If the OS then decides to take down the whole machine, the OS - not
the hardware - can choose to do something that will punch through
other CPU's NMI blocking (notably, init/reset), but the hardware doing
this on its own is just broken if true.

Anyway, I repeat: I refuse to fix hardware bugs. As far as we are
concerned, this is "best effort", and the hardware designers should
take a long deep look at their idiotic schemes. If something punches
through NMI, it's deadly. It's that simple.

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ