linux-kernel - Re: [RFC] x86_64: A real proposal for iret-less return to kernel


Message-ID: <CA+55aFyD7wzt_Ev1yc3R+WnFNCBodGkEw5-qLsbhFMPQoFnEWw@mail.gmail.com>
Date:	Thu, 22 May 2014 09:03:34 +0900
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Borislav Petkov <bp@...en8.de>
Cc:	"Luck, Tony" <tony.luck@...el.com>,
	Andy Lutomirski <luto@...capital.net>,
	Jiri Kosina <jkosina@...e.cz>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andi Kleen <andi@...stfloor.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC] x86_64: A real proposal for iret-less return to kernel

On Thu, May 22, 2014 at 8:51 AM, Borislav Petkov <bp@...en8.de> wrote:
>
> Regardless, exceptions like MCE cannot be held pending and do pierce the
> NMI handler on both.

No, that's fine, if it's a thread-synchronous thing (ie a memory load
that causes errors). But for NMI handlers, that is irrelevant: if the
NMI code itself gets memory errors, the machine really is dead. Let's
face it, we're going to panic and reboot, there's no other real
alternative (other than the "just log it, pray, and continue in
unstable mode", which is actually a perfectly valid alternative in
many cases, since people don't necessarily care deeply and have
written their distributed algorithms to not rely on any particular
thread too much, and will verify the end results anyway).

The problem is literally the non-synchronous things (like another CPU
having problems) where things like broadcast will actually turn a
non-thread-synchronous thing into problems for other CPUs. Then, a
user-mode memory access error (that we *can* recover from, perhaps by
killing the process and isolating the page) can turn into an
unrecoverable error on another CPU because it got interrupted at a
point where it really couldn't afford to be interrupted.

It appears Intel is fixing their braindamage.

                      Linus