linux-kernel - Re: Bricked x86 CPU with software?

Message-ID: <4db2b83b-97a2-4926-e1a2-93a256d625e0@marcan.st>
Date:   Fri, 5 Jan 2018 10:29:25 +0900
From:   Hector Martin 'marcan' <marcan@...can.st>
To:     Tim Mouraveiko <tim.ml@...opper.com>, Pavel Machek <pavel@....cz>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: Bricked x86 CPU with software?

On 2018-01-05 10:21, Tim Mouraveiko wrote:
>> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
>> Actually... I don't think your code works. That's why I'm curious. But
>> if it works, it's rather big news... and I'm sure Intel and cloud
>> providers are going to be interested.
>>
> 
> I first discovered this issue over a year ago, quite by accident. I changed the code I was
> working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
> of it. They didn't care much, with one of their personnel suggesting that they already knew about it
> (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> again. It could be a buggy implementation of a certain x86 functionality, but I left it at that
> because I had better things to do with my time.
> 
> Now this news came up about Meltdown and Spectre and I was curious if anyone else had
> experienced a dead CPU by software, too. Meltdown and Spectre are undeniably a problem,
> but their magnitude and practicality are questionable.
> 
> I suspect that what I discovered is either a kill switch, an unintentional flaw that was
> implemented at the time the original feature was built into x86 functionality and kept
> propagating through successive generations of processors, or it could well be that I have a
> very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the
> question out there, to see if anyone else had a similar experience. Putting the solar flare idea
> aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
> this time by my observations of the CPU behavior.
> 

If you made Intel aware of the issue a year ago, and they weren't
interested, then the responsible thing to do is disclose the problem
publicly. This is a security issue (if trusted code can brick a CPU,
it's an issue for bare metal hosting providers; if untrusted code can
brick a CPU, it's a *huge* issue for every cloud provider and many, many
others who run code in various sandboxes). If the vendor is not
receptive to coordinated disclosure, the only option is public
disclosure to at least make people aware of the problem and allow for
mitigations to be developed, if possible.

Personally, I would be very interested in seeing such code. We've seen
several ways to brick nonvolatile firmware (writable BIOSes, bad CMOS
data, etc.), but bricking a CPU is a first. The only way that can happen
is either blowing a kill fuse, or causing actual hardware damage, since
CPUs have no nonvolatile memory other than fuses. Either way this would
be a very interesting result.

-- 
Hector Martin "marcan" (marcan@...can.st)
Public Key: https://mrcn.st/pub