Message-ID: <56D096E4.3010006@hurleysoftware.com>
Date:	Fri, 26 Feb 2016 10:18:12 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Jiri Slaby <jslaby@...e.cz>, Greg KH <gregkh@...uxfoundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	stable <stable@...r.kernel.org>, lwn@....net,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was:
 Linux 4.4.2]

On 02/26/2016 10:05 AM, Linus Torvalds wrote:
> On Fri, Feb 26, 2016 at 9:52 AM, Peter Hurley <peter@...leysoftware.com> wrote:
>>
>> So more analysis would seem to confirm that RSP has been bumped +8
>> while in ttwu_stat() so when the epilog executed, register restore
>> was off by 1 qword. However, there's nothing in ttwu_stat() that
>> results in stack pointer offset by +1 qword from prolog.
> 
> I agree.
> 
> That's why I'm actually starting to suspect that it's an AMD microcode
> bug that we know very little about. There's apparently register
> corruption (the guess being from NMI handling, but virtualization was
> also involved) under some circumstances.

Yep, that could explain it.

> Of course, if Jiri isn't actually running this on an AMD CPU, that
> theory flies right out the window.

I'll wait for Jiri to confirm before sinking more time here.


> But we do have a reported oops on
> the security list that looks totally different in the big picture, but
> shares the exact same "corrupted stack pointer register state
> resulting in crazy instruction pointer, resulting in NX fault"
> behavior in the end.
> 
> In the other case, microcode patchlevel 0x0600081c was fine, and
> 0x06000832 is the one exhibiting the corruption problem.
> 
> I've contacted Robert Święcki (who found the microcode problem) in
> case he wants to weigh in on this thread. He was talking to some AMD
> people, but I don't know exactly who.

Ok, thanks for the info.
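
For anyone who wants to check a machine against the two patch levels
quoted above, here is a minimal sketch. It assumes an x86 Linux box where
/proc/cpuinfo exposes the "vendor_id" and "microcode" fields; the helper
names (parse_cpuinfo, describe, check_local) are made up for illustration,
and the good/bad levels are simply the ones reported in this thread, not an
authoritative list.

```python
import re

# Patch levels as quoted in this thread (taken from the other report,
# not independently verified here).
REPORTED_GOOD = 0x0600081C
REPORTED_BAD = 0x06000832


def parse_cpuinfo(text):
    """Return (vendor_id, microcode_level) for the first CPU entry.

    Either element is None if the corresponding field is missing.
    """
    vendor = re.search(r"^vendor_id\s*:\s*(\S+)", text, re.M)
    ucode = re.search(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", text, re.M)
    return (
        vendor.group(1) if vendor else None,
        int(ucode.group(1), 16) if ucode else None,
    )


def describe(vendor, level):
    """Summarize how the CPU relates to the levels mentioned in the thread."""
    if vendor != "AuthenticAMD":
        return "not an AMD CPU: the microcode theory does not apply"
    if level is None:
        return "AMD CPU, microcode level not reported"
    if level == REPORTED_BAD:
        return "AMD CPU at 0x%08x: the level reported as corrupting" % level
    if level == REPORTED_GOOD:
        return "AMD CPU at 0x%08x: the level reported as fine" % level
    return "AMD CPU at 0x%08x: not one of the two levels mentioned" % level


def check_local(path="/proc/cpuinfo"):
    """Read the local cpuinfo file (hypothetical convenience wrapper)."""
    with open(path) as f:
        return describe(*parse_cpuinfo(f.read()))
```

Calling print(check_local()) on the affected machine gives a one-line
summary. A level other than the two listed says nothing either way, since
only those two were mentioned in the report.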