linux-kernel - RE: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors into the genpool.


Message-ID: <3908561D78D1C84285E8C5FCA982C28F39EA08E1@ORSMSX114.amr.corp.intel.com>
Date:	Tue, 24 Nov 2015 15:51:21 +0000
From:	"Luck, Tony" <tony.luck@...el.com>
To:	Borislav Petkov <bp@...en8.de>
CC:	"Chen, Gong" <gong.chen@...ux.intel.com>,
	"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred
 errors into the genpool.

>> Ok ... applied those two on top of my "UNTESTED" patch and injected an error to force a UCNA log.
>
> Ok, what error type is that in EINJ nomenclature? I had only
>
> /sys/kernel/debug/apei/einj/available_error_type:0x00000002     Processor Uncorrectable non-fatal
> /sys/kernel/debug/apei/einj/available_error_type:0x00000008     Memory Correctable
> /sys/kernel/debug/apei/einj/available_error_type:0x00000010     Memory Uncorrectable non-fatal
>
> and I would've guessed it is the 0x10 type, i.e., the memory
> uncorrectable which is non-fatal - assuming here - but that one got
> promoted to a #MC on my box.

I juggled the injection type and the instruction sequence used to access the target
location.  I used 0x10 to inject an uncorrected memory error, with "# echo 1 > notrigger"
to make sure the EINJ driver skipped the trigger actions. Then I had a user-mode test program
write a byte to the cache line.  That pulled the uncorrected data into the cache (which logged
the UCNA error, signaled with CMCI). But the processor didn't actually consume the poison
(no registers had corrupted data), so there was no machine check.

Sneaky, huh?

-Tony
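
For readers who want to try the injection half of the procedure described above, here is a minimal sketch. It is not the script used in this thread. It assumes the standard EINJ debugfs files (error_type, param1, param2, notrigger, error_inject); the function name einj_inject_notrigger, the EINJ_DIR variable, and the page-granular mask written to param2 are illustrative choices. The caller must supply the physical address of the target cache line, and the follow-up step (a user-mode program storing a byte to that line) is not shown.

```shell
#!/bin/sh
# Sketch only: arm an EINJ memory error while skipping the trigger actions.
# EINJ_DIR and the function name are hypothetical; override EINJ_DIR to dry-run.
EINJ_DIR=${EINJ_DIR:-/sys/kernel/debug/apei/einj}

einj_inject_notrigger() {
    phys_addr=$1    # physical address of the target line (supplied by the caller)

    # 0x10 is "Memory Uncorrectable non-fatal" in available_error_type.
    echo 0x10 > "$EINJ_DIR/error_type" || return 1
    echo "$phys_addr" > "$EINJ_DIR/param1" || return 1
    # Assumed mask: match on a page-aligned address.
    echo 0xfffffffffffff000 > "$EINJ_DIR/param2" || return 1
    # Tell the EINJ driver to inject but not run the trigger actions.
    echo 1 > "$EINJ_DIR/notrigger" || return 1
    # Perform the injection.
    echo 1 > "$EINJ_DIR/error_inject" || return 1
}
```

Called as root with a real address (for example "einj_inject_notrigger 0x12345000"; the address here is a placeholder), this leaves the poison in memory untouched until something accesses it, which is what lets a plain store pull it into the cache without consuming it.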