Message-ID: <CAP145phdemqK0hCSw9m5E2TZbrBojnzqAVZ97O5APtLA6Abr7g@mail.gmail.com>
Date: Fri, 26 Feb 2016 20:59:59 +0100
From: Robert Święcki <robert@...ecki.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Hurley <peter@...leysoftware.com>,
Jiri Slaby <jslaby@...e.cz>,
Greg KH <gregkh@...uxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
stable <stable@...r.kernel.org>, lwn@....net,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was:
Linux 4.4.2]
2016-02-26 20:44 GMT+01:00 Linus Torvalds <torvalds@...ux-foundation.org>:
>> I've contacted Robert Święcki (who found the microcode problem) in
>> case he wants to weigh in on this thread. He was talking to some AMD
>> people, but I don't know exactly who.
>
> And since it's looking increasingly likely that it's the same issue,
> I'm adding Robert here explicitly to the cc so that he sees the
> thread...
Thx,
Some data I was able to gather:
It happens only with the 0x06000832 ucode on Piledriver-based CPUs, i.e.
newer AMD FX and Opteron x300 series (4300, 6300, etc.).
The visible effect in ~80% of cases is an incorrect RSP, leading to bad
'rets' into kernel data/bss or stack-protector faults. But there are
also more elusive ones, like registers being cleared before use in
indirect memory fetches.
I can trigger it from within a qemu guest (non-root), causing a bad RIP
in the host kernel. When testing, a couple of times (maybe 2) out of
maybe 30 oopses seen, I was able to set RIP to user-space addresses
mapped in the guest. It greatly depends on timing, but I think with
some more effort and populating the kernel stack with guest addresses
it'd be possible to create a more reliable qemu-guest-to-host ring0
escape.
I CC'd some AMD engineers from this list, and one of them replied with
"We are working on the final testing of a new microcode patch to
replace 0x06000832."
but without specifying any erratum number or an ETA for the new ucode.
For now I can only suggest not using 0x06000832 if possible (i.e. if
it's not embedded in the BIOS). I tested a few versions from
http://www.amd64.org/microcode.html and only this one seemed
vulnerable.
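A minimal sketch for checking whether a Linux box is currently running
that revision, assuming /proc/cpuinfo exposes the usual "vendor_id",
"cpu family" and "microcode" fields. The function names are made up for
illustration, and matching on family 21 (15h) plus the revision is only
a rough stand-in for "Piledriver":

```python
AFFECTED_REVISION = 0x06000832  # the ucode revision named in this thread
AMD_FAMILY_15H = 21             # assumption: match on "cpu family" 21 (15h)


def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical CPU."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def affected_cpus(text):
    """Return the logical CPU numbers that report the affected revision."""
    hits = []
    for cpu in parse_cpuinfo(text):
        if cpu.get("vendor_id") != "AuthenticAMD":
            continue
        try:
            number = int(cpu.get("processor", ""))
            family = int(cpu.get("cpu family", ""))
            revision = int(cpu.get("microcode", ""), 16)
        except ValueError:
            continue  # field missing or not in the expected format
        if family == AMD_FAMILY_15H and revision == AFFECTED_REVISION:
            hits.append(number)
    return hits
```

On a live system it would be called as
affected_cpus(open("/proc/cpuinfo").read()); a non-empty list means the
loaded ucode is the revision discussed above. It says nothing about
whether that revision comes from the BIOS or from the OS loader.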
PS. There's a KB article on the VMware pages -
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2061211
- which looks very similar to this problem (it affects Opteron 6300,
which is Piledriver-based), and it was "somehow" patched by VMware in
their kernel. It points to AMD erratum #815 -
http://support.amd.com/TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf -
but I cannot tell whether it's really the same problem, or whether it
can be worked around on the kernel side.
--
Robert Święcki